Friday, August 15th, 2008...4:02 pm
Linkovbot: A Natural Language Processing IRC Bot
This is the first of three posts wrapping up my experiences with Google Summer of Code 2008 and the Singularity Institute for Artificial Intelligence.
Although it had nothing to do with my main project proposal, Linkovbot does use the same NLP software, and I did write it to give myself a break from reading too much about graph theory, so that counts... right? It was just a bit of fun, really.
Anyway, as the title says, Linkovbot is a natural language processing IRC bot. It lurks in an IRC channel, reading what people say and building a corpus. Optionally, it adds a sentence to the corpus only if the sentence is grammatically correct. If you send the bot a private message, it generates a completely new sentence, which will also be grammatically correct. It generates the text with Markov chains and then uses RelEx’s parse confidence score to verify that the text is grammatically correct. At first, it will probably only repeat things that have already been said, but as the corpus expands, it begins to say novel and complex things.
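As a rough sketch of that generate-then-verify loop (not the bot's actual code): a first-order Markov chain proposes candidate sentences, and a scoring function decides which one to keep. The names `build_chain`, `random_walk`, `generate`, and the `threshold` value are made up for illustration, and `parse_confidence` is a hypothetical stand-in for a call into RelEx, which is not shown here.

```python
import random
from collections import defaultdict

# Marker tokens for sentence boundaries (an assumption of this sketch).
START, END = "<s>", "</s>"

def build_chain(sentences):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def random_walk(chain, rng, max_words=25):
    """Walk the chain from START until END is drawn or max_words is reached."""
    words = []
    current = START
    while len(words) < max_words:
        current = rng.choice(chain[current])
        if current == END:
            break
        words.append(current)
    return " ".join(words)

def generate(chain, parse_confidence, threshold=0.5, attempts=100, rng=None):
    """Return the first candidate whose score passes the threshold, else None.

    parse_confidence is a hypothetical callable taking a sentence and
    returning a number; in the bot this role is played by RelEx.
    """
    rng = rng or random.Random()
    if not chain.get(START):
        return None  # empty corpus: nothing to generate from
    for _ in range(attempts):
        candidate = random_walk(chain, rng)
        if candidate and parse_confidence(candidate) >= threshold:
            return candidate
    return None
```

With a small corpus, most walks simply reproduce an existing sentence, which matches the behaviour described above; novel sentences only appear once several sentences share words.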
When we were first playing with it in the #opencog room, all of the SIAI people immediately began to teach it horrible things. I find it absolutely terrifying that the people working on the world’s best project to achieve Artificial General Intelligence are so willing to teach robots to be evil!
Here are some highlights of the first day of tests. Notice how the sentences become longer and more varied as time passes.
I like robots.
Life is beautiful.
Can we talk to linkovbot directly?
Can we talk to linkovbot directly?
What does that mean?
Mine is for everyone.
I like robots.
It is beautiful.
Kill all humans.
An apple is being too picky with its skin.
A ball is beautiful.
It is dead.
It is beautiful.
An apple is beautiful.
Kill all humans.
Save the skin.
Maybe they do not work.
All apples are markov chains from the world.
Human flesh is for everyone.
Kill all humans.
All apples are red.
All apples are red skin.
Save the world.
Are markov chains from the corpus.
It keeps telling me life is only a kitten.
Gmail is being too picky with its sentences.
Total anarchy.
Life is for everyone.
Not all balls are red.
2040 will be a serious chat room.
Human flesh is the corrupted humans.
It generates markov chains from the world.
Sometimes their cache also does not work.
Followers of Mammon cowered in linkov.
I think you might be evil.
Save the internal representation.
First language generation is beautiful.
Time flies like apples are intertwined by definition.
Human flesh is dead.
Kill a strange sentence.
Please Let Knight Rider Die In Peace.
Fun is a warning.
It certainly was that woman.
Kill all humans.
Randomly just need panadol.
The old obsessions are humans.
Of course, the sentence generation is very, very crude. While these are all technically grammatically correct (at least according to the RelEx confidence score, which isn’t always entirely accurate), they don’t necessarily make sense. The Markov generation originally came from a Python script by John Schanck, Python Ninja, which is based on another one by Greg McFarlane, which is in turn based on the original idea and code by Bruce Ellis. John wrote his script to see if computers could rap if you trained them on the complete works of “Notorious” Christopher George “Big Poppa” Latore Wallace. It’s a brilliant idea, and it produced such gems as:
Yeah, i see you
what’s beef?
Beef is?
we ain’t no cryin at the wake up
hoodie to sleep
beef is deadly
The next step would be to cross it with a rhyming dictionary and a syllable counter, if anybody is up for it! Perhaps we could start the first “human competitive” AI rap battle! A bit like the Humies, only for freestyling instead of scientific achievement. This needs to be investigated further.
I’ve wandered off topic. Back to sentence generation.
I’ve considered things like seeding the Markov chains with the subject of the input sentence, but I’d much rather come up with an entirely different, more intelligent way of generating new sentences. This pertains to the main part of my project, so I will write more about this in the third post.
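For the curious, the seeding idea could look something like the sketch below. It assumes the subject word has already been extracted from the input sentence (that step would need a parser such as RelEx and is not shown), and it simply starts the random walk at that word instead of at the beginning of a sentence. The function names are invented for this example.

```python
import random
from collections import defaultdict

END = "</s>"  # end-of-sentence marker (an assumption of this sketch)

def build_chain(sentences):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate_from_seed(chain, seed, rng, max_words=25):
    """Start the walk at `seed`; return None if the corpus never used it."""
    if seed not in chain:
        return None
    words = [seed]
    while len(words) < max_words:
        following = rng.choice(chain[words[-1]])
        if following == END:
            break
        words.append(following)
    return " ".join(words)
```

The obvious limitation is that every reply begins with the seed word; walking a second, reversed chain backwards from the seed would be one way around that.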
The IRC aspect of the bot is done in Python using python-IRCLib. It’s very simple, so it might be a good example if you are just starting to write your own IRC chat bot. Take a look at the source here (linkov.pys).
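If you just want the shape of the message handling without reading the source, here is a library-free sketch. It works on raw IRC `PRIVMSG` lines rather than through python-irclib's event callbacks, so it is not how linkov.pys is written; `route`, `bot_nick`, and `make_sentence` are names made up for this example. Channel messages are added to the corpus, and private messages get a generated sentence back.

```python
import re

# A raw IRC message line looks like:
#   :nick!user@host PRIVMSG <target> :<text>
PRIVMSG = re.compile(
    r"^:(?P<nick>[^!\s]+)!\S+ PRIVMSG (?P<target>\S+) :(?P<text>.*)$"
)

def route(line, bot_nick, corpus, make_sentence):
    """Handle one raw IRC line.

    Channel messages are appended to `corpus` and produce no reply.
    Private messages to the bot produce a raw PRIVMSG reply line.
    `make_sentence` is a hypothetical zero-argument callable that
    returns a generated sentence.
    """
    match = PRIVMSG.match(line.rstrip("\r\n"))
    if match is None:
        return None  # not a PRIVMSG (e.g. PING, JOIN): ignore
    nick, target, text = match.group("nick", "target", "text")
    if target.startswith("#"):
        # Said in a channel: lurk and learn. (Only '#' channels are
        # handled in this sketch.)
        corpus.append(text)
        return None
    if target == bot_nick:
        # Said to the bot directly: answer the sender.
        return f"PRIVMSG {nick} :{make_sentence()}"
    return None
```

Keeping the routing separate from the network code like this also makes it easy to try out without connecting to a server.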
Although this was a just-for-fun project, it’s only a 1-line change in ‘relex.sh’ to turn this into an NLP research bot which saves NLP parses of IRC chats.
The whole package with RelEx included is available here (linkovbot.tar.gz). To run it, you’ll need to make sure you have all the necessary dependencies, namely Link-Grammar, which you can read about here.
Stay tuned for the next two posts in this series, “RelEx Crawler: Natural Language Processing Web Crawler and HyperGraphDB Manager” and “Visualizing Natural Language Processing Data and Extracting Conceptual Relationships”!
I’ll leave you with a final, chilling thought from one of the other OpenCogers when we noticed how scarily the robot was acting:
Systems without goals adopt the goal of evil.
We’re fucked!
Rich
5 Comments
August 16th, 2008 at 9:34 am
This is a test comment.
August 16th, 2008 at 11:55 am
Kill all humans.
August 21st, 2008 at 12:45 am
I’m trying to use your irc bot, but python-irclib is silently failing to do anything.
August 22nd, 2008 at 8:28 am
I’m guessing you’re having dependency issues. Can you make RelEx work on its own?
August 27th, 2008 at 9:59 pm
[…] This is the final of three posts wrapping up my experiences with Google Summer of Code 2008 and the Singularity Institute for Artificial Intelligence. (The other two are here and here.) […]