Recommendation Engine for FF.net

#1
Hey all,

I'm a web developer by day, and I was becoming frustrated with navigating long lists in forums, and long filters on ff.net trying to find some new stuff.

So, I came up with an idea.

Lots of people favorite a story. As an author, you get to see people who have favorited one of your stories, but what if there was a tool that let you backreference favorites and then suggest stories based on people who have favorited similar stories?

For example: say I wanted to find suggestions for stories similar to my own fic... <a href='http://www.fanfiction.net/s/5645536/1/The_Day_Harry_Potter_Died' target='_blank' rel='nofollow'>The Day Harry Potter Died</a>. What if it looked at a list, saw who had favorited THAT story, then gave me a list of the most common OTHER favorites among those people?
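A minimal sketch of that back-referencing idea, assuming the favorites are already loaded into memory as a mapping from user to a set of story IDs. The names `recommend` and `favorites_by_user` and the sample data are made up for illustration; this is not the site's actual code.

```python
from collections import Counter

def recommend(story_id, favorites_by_user, top_n=10):
    """Rank stories by how many fans of `story_id` also favorited them."""
    counts = Counter()
    for favs in favorites_by_user.values():
        if story_id in favs:
            # Count every OTHER favorite of someone who favorited the reference story.
            counts.update(s for s in favs if s != story_id)
    # Most common first; ties broken by story ID so the order is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical sample data: user -> set of favorited story IDs.
favorites_by_user = {
    "alice": {5645536, 111, 222},
    "bob": {5645536, 111, 333},
    "carol": {111, 444},
}
print(recommend(5645536, favorites_by_user))
```

Carol's list is ignored here because she never favorited the reference story, so only favorites shared by its fans are counted.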

That's what I've been doing. Unfortunately, this is all unofficial, which means I don't have access to the ff.net database, so I wrote a spider that went out and indexed over 1,500 users' favorites (over 370,000). This list is obviously far from complete, and while it will have fics from all fandoms, it's Naruto/HP heavy as that's where I started. People have been adding their own lists, so we're up over 2,300 people and 500,000 favorites now.

So why am I posting this here?

I would like to get your feedback.

The tool can be found <a href='http://fanfictionrecs.net/favs_test.php' target='_blank' rel='nofollow'>here</a>... You will need the story ID number for your reference story. Using my example, I'd put the ID for The Day Harry Potter Died (5645536) into the box and hit go, which would give me <a href='http://fanfictionrecs.net/favs_test.php?storyid=5645536' target='_blank' rel='nofollow'>this list</a>.

Let me know what you think.

You can even add profiles for indexing <a href='http://fanfictionrecs.net/suggest_user.php' target='_blank' rel='nofollow'>here</a> (when my server feels like cooperating). You can view stats <a href='http://fanfictionrecs.net/ff_stats.php' target='_blank' rel='nofollow'>here</a>.
 
#2
The engine NEEDS more entries. But, I really like the idea and have bookmarked the page.

Thanks for your efforts !!

-myrubbish2001
 

PCHeintz72

The Sentient Fanfic Search Engine mk II
#3
Hmmm... Here are some thoughts, not in any particular order... feel free to disagree/ignore or agree with.

Interesting concept and program... but, it may be fundamentally flawed.

Note I mean that as no insult intended... I speak from personal experience because of my own Lister program, which while focused on a completely different aspect of linking, does something similar at its core, and I know the flaws in my own design quite well after 5+ years of using it and adding/rewriting portions of it.

Unless I have pegged this wrong, it sounds like you are using static linked lists in a pseudo list processor back end and search engine UI environment.

Basically, you download or otherwise acquire using your spider program... the raw lists or pages of these lists from the authors supported.

But... what if an author adds after you update? Or removes after you update? Or if it is a brand new story, it may be good, but not in the favorites lists of others yet.

The fundamental flaws in that design include:
1. Unless you manually recheck them every now and then, or update them with the spider-like program, at least on a per-author basis... they get out of date and new updates are not added.
2. What of stories elsewhere... say Mediaminer.ORG, FicWad.COM, or an author home page, or a series-specific site? Harry Potter, Ranma, Buffy the Vampire Slayer, and Evangelion, for example, have series-specific sites.
3. It can only grow... and grow... and grow.
4. What of authors that had placed their favorites or site recommendations in their profile description, and not the favorites tab?

Additionally... You are relying on others to properly have fan fiction favorites in their Personal Profile on FanFiction.NET. I can tell you I have exactly 0 entries. There is also the matter of relying on the opinions of others.

In my case, the Lister program I wrote uses my Internet Explorer Favorites list, and is in no way linked directly to fan fiction on-line but to the local list, thus no spider program for on-line purposes is needed. But it *is* linked instead to my downloaded fan fiction directories, and in your terms a spider like process occurs to match favorites to stories to generate the output files.... thus using static linked lists.

The issue for me comes into play where if I download a fan fiction story, but do not place the new names in the correct directories, the program will miss it when I next run it. Likewise, if I do not add a link to a new author or story, of which I had no author link before, the story will not make it into my lists with a link, but will still be listed without one.
 
#4
A couple of notes:

-To use it, you don't need to maintain any favorites lists. It uses data from the people that do.

-The data is stored in a relational database after being scraped from the actual user pages.

-After a profile is added, it is recrawled once a week (in a rolling list) to be reindexed for new favorites and deletions.

-I plan on adding other fanfiction sites, but started with ff.net since it's the largest and easiest to crawl, but thank you for your concern/suggestion. :)

-In the future, I was looking at letting people "claim" profiles, so that they could say "this is my profile on ff.net, and on mediaminer, and on adultfanfiction, and on mugglenet", etc.

By allowing them to do that you could link all those favorites choices together, and also provide an automatic list of suggestions. But we'll see. So far people seem to be pretty content to use it as is.

I'm also looking into other ways of backreferencing. Favorites lists were the easiest way, but I will be examining others.
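For the curious, the weekly rolling recrawl mentioned in the notes above could look roughly like this. It is only a sketch: the seven-day interval comes from the post, but `due_for_recrawl`, `diff_favorites`, the batch size, and the dates are hypothetical.

```python
from datetime import datetime, timedelta

RECRAWL_INTERVAL = timedelta(days=7)  # "once a week", per the note above

def due_for_recrawl(last_crawled, now, batch_size=100):
    """Return profile IDs not crawled for a week or more, oldest first."""
    due = [(ts, pid) for pid, ts in last_crawled.items()
           if now - ts >= RECRAWL_INTERVAL]
    due.sort()  # oldest crawl first, so the list "rolls" through every profile
    return [pid for _, pid in due[:batch_size]]

def diff_favorites(old, new):
    """Compare two crawls of one profile: (added favorites, removed favorites)."""
    return new - old, old - new

# Hypothetical data: profile ID -> time of last crawl (arbitrary dates).
now = datetime(2010, 3, 15)
last_crawled = {
    "a": datetime(2010, 3, 1),
    "b": datetime(2010, 3, 14),
    "c": datetime(2010, 3, 5),
}
print(due_for_recrawl(last_crawled, now))
print(diff_favorites({1, 2, 3}, {2, 3, 4}))
```

Diffing the old and new sets is what lets deletions be picked up as well as new favorites.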
 

Dartz_IRL

Well-Known Member
#5
Uh. One thing.

There's always going to be a natural bias towards the large fandoms like Harry Potter and Naruto. It's not just a case of the spider starting there... they're the largest fandoms. They're always going to be at the top, both with favourites and reviews count.

Is there a way to make things more relevant to other fandoms, say by adding options to remove certain story fandoms from the results, or include only results from within a single fandom?

If I want, say, more Evangelion fics... I don't want to trawl through a sea of Naruto to find one on page 3.

EDIT: Also, I think the database just shat itself.
 

BlackSun

Well-Known Member
#6
Congratulations, your website is successful enough that your database couldn't handle the load.
And that was a compliment, not a complaint.

Once the database starts working again, I will probably second Dartz, and ask about adding a filter to not show some fandoms.
 
#7
You can already filter by a specific fandom through the <a href='http://fanfictionrecs.net/favs_test.php?full_options=1' target='_blank' rel='nofollow'>advanced search</a>.

You can also improve results by <a href='http://fanfictionrecs.net/suggest_user.php' target='_blank' rel='nofollow'>adding people</a> to the database.

Also... the database died? I guess I was asleep when that happened... hmm...

I'll have to talk to the support staff and see if there's anything I can do.

EDIT:

I disabled the stats page to help reduce load on the database...
 
#8
I'm quite interested in this tool because I was planning something similar. I think the major problem with fanfiction is that there's too much data and too little organization of it. 90% of fanfiction out there might be chaff to you, and it takes lots of effort to sort out the 10% of quality.

I had a different approach though, using ratings instead of relations. I planned on creating a database of stories by spidering FF.net, then letting people register accounts and vote on them, allowing the stories to be filtered by average rating. Your way of doing things might make more sense for how personal the appeal of fanfiction can be, but I think my approach might also be useful. Instead of "I like story X, what else is like it?" it offers "I want to read a story of <Series> with <Character>, what are considered the best ones by the community at large?"
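A rough sketch of that ratings-based query, assuming votes are plain numeric scores and each story is tagged with a series and a set of characters. The function name, the data layout, and the `min_votes` cutoff are all invented for illustration.

```python
def top_rated(stories, votes, series, character, min_votes=2):
    """List (story_id, average rating) for one series/character, best first."""
    ranked = []
    for story_id, info in stories.items():
        if info["series"] != series or character not in info["characters"]:
            continue
        ratings = votes.get(story_id, [])
        if len(ratings) < min_votes:
            continue  # too few votes for the average to mean much
        ranked.append((story_id, sum(ratings) / len(ratings)))
    ranked.sort(key=lambda r: (-r[1], r[0]))
    return ranked

# Hypothetical sample data.
stories = {
    1: {"series": "Naruto", "characters": {"Jiraiya"}},
    2: {"series": "Naruto", "characters": {"Jiraiya", "Iruka"}},
    3: {"series": "Evangelion", "characters": {"Shinji"}},
}
votes = {1: [5, 4], 2: [3, 3, 3], 3: [5, 5]}
print(top_rated(stories, votes, "Naruto", "Jiraiya"))
```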

I imagine the #1 reason fanfiction sites don't have such a feature naturally is that they don't want to hurt people's feelings, or create drama over rankings.

Although I'm just an amateur in databases and coding, so I planned to start working on it in a scaled-back way as practice, but your work here is quite inspiring to me.

(Now, to add my Favorites list...!)
 
#9
The reason I didn't go that route for my first run through was that it only provides decent results for the mass of readers. Nothing that's niche will play very well in a purely rated system... what if you like Jiraiya/Iruka slash for Naruto?

With back-referencing, all you have to do is find one story like that, and it will look for others who share that taste, who are likely to have other stories in their favorites that share that taste.

That said, my system gives equal weight to all picks. People obviously can have stories that are more favorite than others. That's why I was actually building in a rating system as well, though I'm not sure when it will launch.
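As a sketch of how ratings could be folded into back-referencing, assume each favorite optionally carries a numeric weight, and unrated favorites fall back to the current equal weight of 1.0. `weighted_recommend` and the data shape are hypothetical, and the real rating system may end up working quite differently.

```python
from collections import defaultdict

def weighted_recommend(story_id, ratings_by_user, default_weight=1.0, top_n=10):
    """Like plain back-referencing, but each pick adds its rating, not just 1."""
    scores = defaultdict(float)
    for rated in ratings_by_user.values():
        if story_id not in rated:
            continue
        for other, weight in rated.items():
            if other != story_id:
                # None means "favorited but never rated": use the equal weight.
                scores[other] += default_weight if weight is None else weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical data: user -> {story ID: rating weight, or None if unrated}.
ratings_by_user = {
    "alice": {1: None, 2: 3.0, 3: None},
    "bob": {1: None, 3: None, 4: 0.5},
}
print(weighted_recommend(1, ratings_by_user))
```

With every weight left at `None` this reduces to the equal-weight count, so rated and unrated profiles can coexist.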

The third phase of the site will be to allow people to register accounts and "attach" their FF.net profiles to it. After I do that, I'll be able to start adding other sites, like AFF.net, MediaMiner, MuggleNet, etc. to the database, and I'll be able to leverage favorites on one site to find stories on another.

That's probably several months away though.
 

hchan1

Well-Known Member
#10
It's a great idea, similar to the 'Recommended for You' category that services like YouTube and Steam provide. I've only been playing around with your search for half an hour or so, but plugging in my favorite stories pings a disturbingly large number of my other favorites as well, so it's definitely a good thing that you can exclude your own profile. It seems to be fairly accurate as well, returning relevant results from the various genres I tried for individual fandoms, although I haven't tried any relatively obscure stories yet.

I was curious about one thing though: is there a way to see how many "hits" a result has? By hit I mean number of times that story shows up on a favorite list. Since new profiles are added manually, the results will probably be wonky due to the small pool of profiles. I'm already noticing stories in the top 10 of some of my queries that have absolutely no place being there due to their utter terribleness, so I'm guessing that those are single-hit entries that by sheer randomness ended up in higher positions.
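To illustrate what I mean, here is a small sketch of reporting hits alongside each result and dropping single-hit entries. It assumes favorites are available as a user-to-set mapping; `hits_report`, the `min_hits` threshold, and the sample data are my own invention, not how the site works.

```python
from collections import Counter

def hits_report(story_id, favorites_by_user, min_hits=2):
    """Return (story, hits, share of fans) for stories with at least `min_hits`."""
    fans = [favs for favs in favorites_by_user.values() if story_id in favs]
    hits = Counter(s for favs in fans for s in favs if s != story_id)
    report = [(s, n, n / len(fans)) for s, n in hits.items() if n >= min_hits]
    report.sort(key=lambda r: (-r[1], r[0]))
    return report

# Hypothetical data: user -> set of favorited story IDs.
favorites = {
    "a": {1, 2, 3},
    "b": {1, 2},
    "c": {1, 2, 4},
    "d": {5},
}
print(hits_report(1, favorites))
```

Stories 3 and 4 each show up on only one fan's list, so with `min_hits=2` they drop out instead of landing in the top 10 by chance.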

Sidenote, deliberately plugging in IDs of terribad/troll fics out of morbid curiosity is a bad idea unless you really want to see the cesspool of fanfiction. Just... ugh.
 
#12
JordanL said:
The reason I didn't go that route for my first run through was that it only provides decent results for the mass of readers. Nothing that's niche will play very well in a purely rated system... what if you like Jiraiya/Iruka slash for Naruto?

With back-referencing, all you have to do is find one story like that, and it will look for others who share that taste, who are likely to have other stories in their favorites that share that taste.

That said, my system gives equal weight to all picks. People obviously can have stories that are more favorite than others. That's why I was actually building in a rating system as well, though I'm not sure when it will launch.

The third phase of the site will be to allow people to register accounts and "attach" their FF.net profiles to it. After I do that, I'll be able to start adding other sites, like AFF.net, MediaMiner, MuggleNet, etc. to the database, and I'll be able to leverage favorites on one site to find stories on another.

That's probably several months away though.
Those planned features sound really nice. Regarding which other sites to add, that reminds me of Raimond Eisele's <a href='http://www.fanfictiondownloader.net/contact.php' target='_blank' rel='nofollow'>FanFictionDownloader</a>* which also started with just ff.net and expanded onto different sites. Its currently supported sites are:

<a href='http://www.FanFiction.Net' target='_blank' rel='nofollow'>http://www.FanFiction.Net</a>
<a href='http://www.FanFicAuthors.Net' target='_blank' rel='nofollow'>http://www.FanFicAuthors.Net</a>
<a href='http://www.squidge.org/~peja' target='_blank' rel='nofollow'>http://www.squidge.org/~peja</a> The Wonderful World of Makebelieve
<a href='http://www.Mugglenet.com' target='_blank' rel='nofollow'>http://www.Mugglenet.com</a>
<a href='http://www.AdultFanFiction.net' target='_blank' rel='nofollow'>http://www.AdultFanFiction.net</a>
<a href='http://www.FictionPress.com' target='_blank' rel='nofollow'>http://www.FictionPress.com</a>

* An equally fundamental tool, especially to people like me who read fanfic on their eReader!
 
#13
linger said:
Those planned features sound really nice. Regarding which other sites to add, that reminds me of Raimond Eisele's <a href='http://www.fanfictiondownloader.net/contact.php' target='_blank' rel='nofollow'>FanFictionDownloader</a>* which also started with just ff.net and expanded onto different sites. Its currently supported sites are:

<a href='http://www.FanFiction.Net' target='_blank' rel='nofollow'>http://www.FanFiction.Net</a>
<a href='http://www.FanFicAuthors.Net' target='_blank' rel='nofollow'>http://www.FanFicAuthors.Net</a>
<a href='http://www.squidge.org/~peja' target='_blank' rel='nofollow'>http://www.squidge.org/~peja</a> The Wonderful World of Makebelieve
<a href='http://www.Mugglenet.com' target='_blank' rel='nofollow'>http://www.Mugglenet.com</a>
<a href='http://www.AdultFanFiction.net' target='_blank' rel='nofollow'>http://www.AdultFanFiction.net</a>
<a href='http://www.FictionPress.com' target='_blank' rel='nofollow'>http://www.FictionPress.com</a>

* An equally fundamental tool, especially to people like me who read fanfic on their eReader!
All results can be downloaded in an ePub via a link next to the title, or an arbitrary story can be downloaded via a story ID on the right hand bar.

I wanted to integrate ebooks. :)

Also, side note, the index now includes 148,000 user profiles, 14.8 million favorites and 2.6 million stories.

EDIT: Also, thank you for the donation hchan1. I assume it was you. :)
 

hchan1

Well-Known Member
#14
JordanL said:
You mean how many total favorites lists a given story is on?
Yeah, that's what I meant. And you're welcome for the donation; saw the total was close to the monthly goal and figured I'd do the equivalent of buying you a beer for all the hard work. :p
 
#15
I had that information as part of the results, but it was creating overhead that was unnecessary so I removed it to improve the search speed.

Currently the story with the most favorites in my list has 2540 favorites that I'm aware of.

Over 770,000 stories have only 1 favorite.

The average favorites per story is around 7.
 