Creating a Twitter client in Erlang

December 1, 2008

I've been playing around with Erlang for a little while now and decided to create a small client for one of my favourite micro-blogging services, Twitter. I'm sharing it both to help other noobs and as a personal reference for myself.

We want to be able to retrieve a REST-like response from Twitter, so where do we start? The first thing was to start up erl from the command line:
erl -pa ~/erlware/packages/5.6.3/lib/*/ebin "$@"

I had installed erlware to my local directory and linked it into erl by running the above command. Now that we are in erl, I started inets:
inets:start().

Now we are ready to start playing with http:request. I want to be able to retrieve the HTTP response body and do some XML parsing with it:
http:request("http://twitter.com/statuses/public_timeline.rss").
which returned something similar to
{ok,{{"HTTP/1.1",200,"OK"},
[{"cache-control","max-age=1800"},
{"connection","close"},
{"date","Mon, 01 Dec 2008 10:46:06 GMT"},
{"server","hi"},
{"content-encoding","UTF-8"},
{"content-length","9251"},
{"content-type","application/xml"},
{"expires","Mon, 01 Dec 2008 11:16:07 GMT"},
{"set-cookie",
"JSESSIONID=2C9CFD77A4EF365B46884168D089FD4A; Path=/"}],
[60,63,120,109,108,32,118,101,114,115,105,111,110,61,34,49,
46,48,34,32,101,110,99,111|...]}}
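
That long list of integers at the end of the tuple is the body itself. Erlang strings are just lists of character codes, and the shell falls back to printing the raw numbers when it decides the list is not plain printable text. A minimal sketch that turns the first few codes from the output above back into text (the module name string_sketch is made up for illustration):

```erlang
-module(string_sketch).
-export([prefix/0]).

%% The first 19 character codes of the body shown above.
prefix() ->
    Codes = [60,63,120,109,108,32,118,101,114,115,
             105,111,110,61,34,49,46,48,34],
    %% ~s prints the list as a string: <?xml version="1.0"
    io:format("~s~n", [Codes]),
    Codes.
```

So the body really is the start of an XML document, which is what we want to parse.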

With a little digging, we find that the following pattern matches pretty nicely:
{ok,{_Status,_Head,Body}} = http:request("http://twitter.com/statuses/public_timeline.rss").
Typing Body. in the Erlang shell returns a list similar to the following:
[60,63,120,109,108,32,118,101,114,115,105,111,110,61,34,49,
46,48,34,32,101,110,99,111,100,105,110,103,61|...]
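
The pattern above crashes with a badmatch on anything other than an {ok, ...} reply, and it happily accepts a 404 page as a body. As a rough sketch of a more defensive match on the same tuple shape (the module and function names here are my own invention, not part of inets):

```erlang
-module(response_sketch).
-export([body_of/1]).

%% Takes the value returned by the request call and classifies it.
%% A 200 reply yields the body; anything else becomes an error tuple.
body_of({ok, {{_Version, 200, _Reason}, _Headers, Body}}) ->
    {ok, Body};
body_of({ok, {{_Version, Code, Reason}, _Headers, _Body}}) ->
    {error, {http_status, Code, Reason}};
body_of({error, Reason}) ->
    {error, Reason}.
```

For the post itself the simple one-line match is enough, since we are only poking around in the shell.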

This is our HTTP response body. Now we can fire up emacs, or your favourite text editor, and create our first function.
-module(twitterl).
-compile(export_all).

retrieve_response(Url) ->
    {ok,{_Status,_Head,Body}} = http:request(Url),
    Body.
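
A note for anyone following along on a newer Erlang/OTP release: the http module used here was later replaced by httpc. A minimal sketch of the same function on top of httpc, kept in a separate, hypothetical module so it doesn't clash with the one above:

```erlang
-module(twitterl_httpc_sketch).
-export([retrieve_response/1]).

%% Same idea as retrieve_response/1 above, but using httpc,
%% the HTTP client module in later inets releases.
retrieve_response(Url) ->
    %% Make sure inets is running before issuing the request.
    {ok, _Started} = application:ensure_all_started(inets),
    {ok, {_Status, _Head, Body}} = httpc:request(Url),
    Body.
```

The 2008 timeline URL used in this post may well not answer any more, so try it against a feed you know is live. The rest of the post sticks with the original twitterl module.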

Make sure you save this file with the same name as the module, twitterl.erl (or Erlang will complain), and make sure the file is in the same directory that erl is running in.
c(twitterl).
will compile our piece of code. Now we should have the following command available:
twitterl:retrieve_response("http://twitter.com/statuses/public_timeline.rss").
which will give you the same result as the previous commands passed to erl. The Body variable should still hold our HTTP body, so we'll play with that for the time being. If it doesn't, just do the following:
Body = twitterl:retrieve_response("http://twitter.com/statuses/public_timeline.rss").
Now for the next piece: we actually want to parse this body and retrieve all its descriptions (which store our actual twitters).
xmerl_scan:string(Body).
gives us a result similar to the following:
{{xmlElement,rss,rss,[],
{xmlNamespace,[],[]},
[],1,
[{xmlAttribute,version,[],[],[],[],1,[],"2.0",false}],
[{xmlText,[{rss,1}],1,[],"\n ",text},
{xmlElement,channel,channel,[],
{xmlNamespace,[],[]},
[{rss,1}],
2,[],
[{xmlText,[{channel,2},{rss,1}],1,[],"\n ",text},
{xmlElement,title,title,[],{xmlNamespace,...},[...],...},
{xmlText,[{channel,2},{rss,...}],3,[],[...],...},
{xmlElement,link,link,[],...},
{xmlText,[{...}|...],5,...},
{xmlElement,description,...},
{xmlText,...},
{...}|...],
[],"/home/baphled/projects/erlang/status_managerl",
undeclared},
{xmlText,[{rss,1}],3,[],"\n ",text}],
[],"/home/baphled/projects/erlang/status_managerl",
undeclared},
[]}
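
The value printed above is a two-element tuple: the parsed root element (an xmlElement record from xmerl) and whatever input was left over after the document. A small sketch on an inline RSS-like string that pulls the root element's name out of that record; the module name and the sample XML are invented for illustration:

```erlang
-module(scan_sketch).
-export([root/1]).
%% Gives us the #xmlElement{} record definition.
-include_lib("xmerl/include/xmerl.hrl").

%% Parses an XML string and returns the root element's name
%% together with the unparsed remainder of the input.
root(XmlString) ->
    {#xmlElement{name = Name}, Rest} = xmerl_scan:string(XmlString),
    {Name, Rest}.
```

Calling scan_sketch:root("<rss version=\"2.0\"><channel/></rss>") should give rss as the name, matching the xmlElement,rss at the top of the output above.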
This returns a tuple holding our parsed XML and whatever input was left over after the document. We only want the XML, so
{Xml, _Rest} = xmerl_scan:string(Body).
will store our XML in the Xml variable. Now that we have the data we want to parse, we can use the following command to retrieve a list of tuples containing our descriptions as XML text nodes, similar to the following:
xmerl_xpath:string("//item/description/text()", Xml).
[{xmlText,[{description,4},{item,12},{channel,2},{rss,1}],
1,[],
[101,120,116,101,101,110,114,101,99,101,110,116,58,32,224,
184,161,224,184,178,224,185,129|...],
text},
{xmlText,[{description,4},{item,13},{channel,2},{rss,1}],
1,[],
"ston3monk: wild place where everyday brings somethnig new ",
text},
{xmlText,[{description,4},{item,13},{channel,2},{rss,1}],
2,[],
"& unepected. do i continue on the unknown, or do i go hang out on koh samui ",
text},
{xmlText,[{description,4},{item,13},{channel,2},{rss,1}],
3,[],"& work 4 a while?",text},
{xmlText,[{description,4},{item,14},{channel,2},{rss,1}],
1,[],
"juffrouwjannie: @Hoof is het al latenweoveralweerzoutopgaanleggendag?",
text},
{xmlText,[{description,4},{item,15},{channel,2},{rss,1}],
1,[],
[116,104,97,105,50,104,97,110,100,58,32,91,84,50,72,93,32,
224|...],
text},
{xmlText,[{description,4},{item,16},{channel,2},{rss,1}],
1,[],
[65,111,121,97,89,117,121,58,32,227,129,138,232,143,147,229,
173|...],
text},
{xmlText,[{description,4},{item,17},{channel,2},{rss,1}],
1,[],
[101,99,108,117,99,105,102,101,114,58,32,64,105,99,104,105|...],
text},
{xmlText,[{description,4},{item,18},{channel,2},{rss,1}],
1,[],
"simonandre: @fabienthomas envoie leur un mail ils sont super r\303\251actifs.. je l'avais fait y'a un an, dans les 48h \303\247a avait \303\251t\303\251 corrig\303\251",
text},
{xmlText,[{description,4},{item,19},{channel,2},{rss,1}],
1,[],
"debugz: @draconiams I saw some male chicks from here today,they are hott too =P",
text},
{xmlText,[{description,4},{item,20},{channel,2},{rss,1}],
1,[],
"csinctw1: [JCSAGE] About to build project=mobility status=Success",
text},
{xmlText,[{description,4},{item,21},{channel,2},{rss,1}],
1,[],
[101,108,101,99,116,114,105,99,97,108,80,101|...],
text},
{xmlText,[{description,4},{item,22},{channel,2},{rss,1}],
1,[],
[108,117,110,97,114,121,117,101,58,32,231|...],
text},
{xmlText,[{description,4},{item,23},{channel,2},{rss,1}],
1,[],
[109,49,48,56,58,32,230,172,161,227|...],
text},
{xmlText,[{description,4},{item,24},{channel,2},{rss,1}],
1,[],"DerAlbert: @adventskalender open 1",text},
{xmlText,[{description,4},{item,25},{channel,2},{rss,1}],
1,[],
[100,97,105,95,97,105,114,58|...],
text},
{xmlText,[{description,4},{item,26},{channel,2},{rss,1}],
1,[],
"backfunshop: backfunshop.de : backfun.de ** Bratentopf 28cm 6,7l auch f\303\274uktionsherde http://tinyurl.com/69ezcc",
text},
{xmlText,[{description,4},{item,27},{channel,2},{rss,1}],
1,[],
"drillhalllib: Want the chance to win \302\243100 in Amazon vouchers? Take part in this years Student Satisfaction Survey now! http://moourl.com/dhl08",
text},
{xmlText,[{description,4},{item,28},{channel,2},{rss,1}],
1,[],
"techdigest: Considering moving to a Dvorak keyboard - anyone ever tried it? http://tinyurl.com/9x84j",
text},
{xmlText,[{description,4},{item,29},{channel,2},{rss,1}],
1,[],
"franksting: @stilgherrian huh, whats that about throat infections? Both ms3 (i can say that now) and I have something and we both went swimming on S ...",
text},
{xmlText,[{description,4},{item,30},{channel,2},{rss,1}],
1,[],
[65,108,108|...],
text},
{xmlText,[{description,4},{item,31},{channel,2},{rss,...}],
1,[],
"abcnewsVic: Great Ocean Road reopens after fatal smash: A stretch of the Great Ocean Road that was the scene of a .. http://is.gd/9HAP",
text}]
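
Notice that item 13 in the output above shows up three times. xmerl appears to split the text wherever an entity such as &amp; occurs, so one twitter can span several xmlText nodes. A hedged sketch of one way around that: select the description elements themselves and join each one's text children. The module name, function names and sample input are all hypothetical.

```erlang
-module(description_sketch).
-export([descriptions/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Returns one string per <description> inside an <item>.
descriptions(XmlString) ->
    {Xml, _Rest} = xmerl_scan:string(XmlString),
    Elements = xmerl_xpath:string("//item/description", Xml),
    [text_of(Element) || Element <- Elements].

%% Joins the values of all text children of one element.
text_of(#xmlElement{content = Content}) ->
    lists:flatten([Value || #xmlText{value = Value} <- Content]).
```

The rest of this post keeps working with the individual text nodes, so a split twitter will simply print on more than one line.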

Now we go back to our source file and create a function for our discovery. The ?PubTimeUrl macro holds the timeline URL; its -define line appears in the full listing further down.
latest_twitters() ->
    Body = retrieve_response(?PubTimeUrl),
    {Xml, _Rest} = xmerl_scan:string(Body),
    Twitters = xmerl_xpath:string("//item/description/text()", Xml),
    Twitters.

This function uses our previously created function to retrieve the response, then parses the XML and searches for the text of each description element within an item element. The entries are stored in Twitters, which is then returned. Save and recompile the file. We can now use our latest_twitters function to return a list of our results.
Twitters = twitterl:latest_twitters().
will store our list in the Twitters variable. Now we know we have a list, so we will manipulate it as follows to see how we can deal with our results.
[H|T] = Twitters.
gives us the first element of the list in H, something similar to the following:
{xmlText,[{description,4},{item,12},{channel,2},{rss,1}],
1,[],
[107,105,114,105,108,108,95,121,117,114,105,58,32,228,187,
138,229,185,180,227,129,174,231,180|...],
text}
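
The tuple shown above is an xmlText record from xmerl, so its positions have names. As a small sketch (the module name is made up), including xmerl.hrl lets us list those field names and pick out the text by name rather than by counting positions:

```erlang
-module(xmltext_sketch).
-export([fields/0, value/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% The field names of the xmlText record, in tuple order
%% (the record tag itself, xmlText, comes before them).
fields() ->
    record_info(fields, xmlText).

%% Extracts the text from an xmlText record.
value(#xmlText{value = Value}) ->
    Value.
```

Because a record is just a tagged tuple, xmltext_sketch:value/1 accepts exactly the kind of term printed above.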

Let's break this down a little so we can deal with the specific data we are interested in. Our term is a tuple containing an atom, a list of tuples, an integer, an empty list, our text, and finally another atom. We only need the text string, so the following pattern will do us:
{_,_,_,_,Title,_}
Let's give it a try.
{_,_,_,_,Title,_} = H.
io:format(Title).
We get a result similar to:
kirill_yuri: \344\273\212\345\271\264\343\201\256\347\264\205\347\231\275\343\201\257\347\264\205\343\202\222\345\213\235\343\201\237\343\201\233\343\201\237\343\201\204\351\233\260\345\233\262\346\260\227\343\201\214\345\205\205\346\272\200\343\201\227\343\201\246\343\202\213\346\260\227\343\201\214\343\201\231\343\202\213\343\202\223\343\201\240\343\201\214\343\201\251\343\201\206\343\201\213\343\201\255\343\200\202
The above is our actual twitter. Armed with this piece of information and a little tinkering, we can add another function to our source file.
loop_twitters([Twit|Twitters]) ->
    case is_tuple(Twit) of
        true ->
            {_,_,_,_,Title,_} = Twit,
            io:format("~s~n",[[Title]]),
            loop_twitters(lists:reverse(Twitters));
        false ->
            {error, "Unable to read twitter"}
    end;
loop_twitters([]) ->
    ok.
We'll need to do a little refactoring to make everything play nice, namely changing our latest_twitters function to pass our twitters on to loop_twitters. I guess we'll want to export only one function now as well, which in this case will be latest_twitters. This is done by removing the line
-compile(export_all).
and replacing it with
-export([latest_twitters/0]).

Our source file should look like this:
-module(twitterl).
-export([init/0,stop/0,latest_twitters/0]).
-include_lib("xmerl/include/xmerl.hrl").
-define(PubTimeUrl, "http://twitter.com/statuses/public_timeline.rss").

%% Start and stop the inets application used for HTTP requests.
init() ->
    inets:start().

stop() ->
    inets:stop().

%% Fetch a URL and return the response body.
retrieve_response(Url) ->
    {ok,{_Status,_Head,Body}} = http:request(Url),
    Body.

%% Fetch the public timeline and print each description.
latest_twitters() ->
    Body = retrieve_response(?PubTimeUrl),
    {Xml, _Rest} = xmerl_scan:string(Body),
    Twitters = xmerl_xpath:string("//item/description/text()", Xml),
    loop_twitters(Twitters).

loop_twitters([Twit|Twitters]) ->
    case is_tuple(Twit) of
        true ->
            {_,_,_,_,Title,_} = Twit,
            io:format("~s~n",[[Title]]),
            loop_twitters(lists:reverse(Twitters));
        false ->
            {error, "Unable to read twitter"}
    end;
loop_twitters([]) ->
    ok.
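
One thing worth knowing about loop_twitters: because it reverses the remaining list on every step, it prints the first entry, then the last, then the second, and so on, alternating between the two ends. If you would rather print in feed order, here is a sketch of one alternative. The module and function names are hypothetical, and unlike the original it silently skips anything that is not an xmlText record rather than returning an error.

```erlang
-module(print_sketch).
-export([titles/1, print_twitters/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Keeps only the text of xmlText records, in their original order.
titles(Nodes) ->
    [Title || #xmlText{value = Title} <- Nodes].

%% Prints one title per line and returns ok.
print_twitters(Nodes) ->
    lists:foreach(fun(Title) -> io:format("~s~n", [Title]) end,
                  titles(Nodes)).
```

To try it, swap the loop_twitters(Twitters) call in latest_twitters for print_sketch:print_twitters(Twitters).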

Now finally we'll compile our source file and run it using the following commands:
c(twitterl).
twitterl:latest_twitters().

This will produce our list of latest twitters, similar to the following:
timetoleadfr: @otromundo merci d'avoir relay\303\251 le message :)
bernardoruas: @rejaneayres I hope so.
osa03: \343\202\246\343\203\274\343\200\202\343\203\255\343\202\270\343\203\203\343\202\257\343\201\214\343\203\257\343\202\253\343\203\203\343\202\277\343\203\274\343\200\202
olemi: @DavidFeng: \345\260\215\344\270\215\350\265\267\357\274\214\346\210\221\344\270\215\347\237\245\351\201\223Catch-22\346\230\257\347\224\232\351\272\274\346\204\217\346\200\235.... :P \350\203\275\350\267\237\346\210\221\350\247\243\351\207\213\344\270\200\344\270\213\345\227\216\357\274\237 :)
ijournal: RenRe's WeatherPredict Acquires Assets of Commodity Hedgers http://twurl.nl/pa5urw
sydsvenskan: Man knivr\303\245nad i Malm\303\266:
En man blev under knivhot r\303\245nad strax efter klockan elva i en g\303\245ngtunnel vid .. http://tinyurl.com/6zktzc
prensa: NOUVELOBS: Venise : la mont\303\251e des eaux atteint 1,56 m\303\250tre: La mer commence
à se retirer l.. http://tinyurl.com/5nb6dd
Erik_Visser: Blijft toch een leuk verhaal van die belgische minister op het blog van Nathalie Lubbebakker. http://www.nathalielubbebakker.com/
petulak: @punio miedo me dais, cinematograficamente hablando!!
...
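
The \303\251 style escapes in that output are the raw UTF-8 bytes of the accented and non-Latin characters, printed one byte at a time. A minimal sketch for turning them back into readable characters, assuming the text really does arrive as a list of UTF-8 bytes as it did here (a newer xmerl may already hand back code points, in which case this step is unnecessary). The module name is made up.

```erlang
-module(utf8_sketch).
-export([decode/1, print/1]).

%% Converts a list of UTF-8 bytes into a list of Unicode code points.
decode(Bytes) ->
    unicode:characters_to_list(list_to_binary(Bytes), utf8).

%% Prints the decoded text; ~ts tells io:format to expect Unicode.
print(Bytes) ->
    io:format("~ts~n", [decode(Bytes)]).
```

For example, the two bytes \303\251 from the first line of output decode to the single character é. Whether it then shows up correctly depends on your terminal being set up for Unicode.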

Well, that's it for the moment. The code is far from perfect and I've already started to refactor it to do more. Currently I'm able to search for a user, retrieve their latest tweets (from and to), and query for terms such as #Erlang. Hopefully when I get the time I'll post some code to help others do similar things.
