I have a problem with app.net. It's the same problem I had with Twitter in the early days. Who am I not following that I should be? In my head, the solution was a simple algorithm:
- look at the list of people I follow
- look at the list of people those people follow
- sum the occurrences of each person
- profit
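The steps above can be sketched on a small in-memory follow graph. This is only a rough Python 3 illustration, not the script from this post: the `follows` dictionary, its sample names, and the `suggest` function are all made up for the example, and it also skips my own username, which is an extra assumption.

```python
from collections import Counter

def suggest(follows, me, cutoff=0):
    """Count how often each stranger is followed by the people `me` follows."""
    mine = set(follows.get(me, []))
    counts = Counter()
    for friend in mine:
        for person in follows.get(friend, []):
            # Skip myself and anyone I already follow.
            if person != me and person not in mine:
                counts[person] += 1
    # Most frequently followed first; keep only counts above the cutoff.
    return [(name, n) for name, n in counts.most_common() if n > cutoff]

# Hypothetical sample data: who follows whom.
follows = {
    "me": ["alice", "bob"],
    "alice": ["bob", "carol", "dave"],
    "bob": ["carol", "me"],
}
print(suggest(follows, "me"))
```

Here carol is followed by both people I follow and dave by only one, so carol ranks first.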
I never got around to writing it into something real for Twitter, but last night between baby feedings and screamings I came up with something for app.net[1]:
import re, urllib, operator
from collections import defaultdict
USERNAME = raw_input("Get suggestions for: ")
CUTOFF = 3
pattern = re.compile(r'<span class="username"><a href="/\w+">(?P<name>\w+)</a></span>')
suggestions = defaultdict(int)
def find_followers(user):
    # Despite the name, this returns the usernames that `user` follows,
    # scraped from their "following" page.
    print "**********"
    print "Finding followers for %s" % user
    url = "https://alpha.app.net/%s/following/" % user
    user_html = urllib.urlopen(url).read()
    user_following = re.findall(pattern, user_html)
    print "%s follows %d people" % (user, len(user_following))
    return user_following
i_follow = find_followers(USERNAME)
n_follow = len(i_follow)
for user in i_follow:
    user_following = find_followers(user)
    for person in user_following:
        if person in i_follow:
            continue
        else:
            suggestions[person] += 1
sorted_suggestions = sorted(suggestions.iteritems(), key=operator.itemgetter(1))
final_suggestions = (item for item in sorted_suggestions if item[1] > CUTOFF)
output = "<html><head></head><body><table border='1'>"
for item in final_suggestions:
    percentage = (float(item[1])/n_follow) * 100
    output += "<tr><td><a href='https://alpha.app.net/%s/'>%s</a></td><td>%.1f%%</td></tr>\n" % (item[0], item[0], percentage)
output += "</table></body></html>"
with open("adn_suggestions.html", "w") as f:
    f.write(output)
print "\nAll done. Your suggestions are in adn_suggestions.html"
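The whole script stands or falls with that regular expression, so it is worth checking the pattern against a scrap of HTML before hitting the site. A minimal Python 3 sketch follows; the `sample_html` string is invented for illustration and assumes only the markup shape that the pattern itself implies.

```python
import re

# Same pattern as in the script above.
pattern = re.compile(r'<span class="username"><a href="/\w+">(?P<name>\w+)</a></span>')

# Hypothetical markup, not copied from app.net.
sample_html = (
    '<span class="username"><a href="/alice">alice</a></span>'
    '<div>unrelated markup</div>'
    '<span class="username"><a href="/bob_42">bob_42</a></span>'
)

# With a single group in the pattern, findall returns just that group's text.
names = pattern.findall(sample_html)
print(names)
```

If the site changes its markup, the pattern simply matches nothing and the script reports that everyone follows 0 people.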
Yes, it's a total HTML scraping hack. Yes, someone could do a far cleaner and better job using the official API and some CSS. Someone is more than welcome to take this as a starting point and do just that. But I have six-month-old twins to take care of, and besides, that just wouldn't be my style.
[1] Also available in gist form: https://gist.github.com/3884563