January 5th, 2009

Anomos @ 25C3

25C3 was amazing! Thanks to everybody who presented and attended! Especially big thanks to the German Privacy Foundation and the i2p team.




The video of our lightning talk is embedded above. You might not be able to see the slides in the video (that’s what the interrupt is about), so if you want to see the slides, you can find them here.

Rich

December 25th, 2008

Going to 25C3; Der Heutigen Stasi

Tomorrow I’m leaving for Berlin, Germany for the 25th Chaos Communication Congress! I’ve never been to Germany and can only speak a very few choice phrases (“Ein bier, bitter!”), so I’m very excited for badly communicated German adventures!
John and I will have an Anomos table set up at some point during the congress, and we will hopefully be giving a lightning talk as well, so come see us! Also, stop by and talk to us or send me an email if you’re going to be there and we can go out for drinks and schnitzel! I’m looking forward to meeting some of the I2P team and some other leet Germans and would love to hang out with you as well.

On a more somber note:
This will be the last time that I will ever be able to re-enter the US without giving the feds (DHS: Der Heutigen Stasi) a lot of my biometric data. On January 18th, all non-US citizens will be photographed and fingerprinted during re-entry to the United States under the US-VISIT (United States Visitor and Immigrant Status Indicator Technology) program. Even though I’m as Americanized as apple pie and police brutality, I’m actually a UK citizen, so this means I am to be fingerprinted, photographed and background checked whenever I leave and reenter the country. “Papers please.”

If you citizens think this is okay, will you still think it makes sense when they start scanning you when you try to leave or enter your own country? This is the first step. If we keep putting up with this now, it only makes it easier for them to expand deeper into our lives later.

The theme of the conference is ‘Nothing to Hide’ and it couldn’t be more appropriate.
R

November 27th, 2008

The Future of Free Culture

A few weeks ago, I went to the Free Culture 2008 Conference in sunny Berkeley, California. The conference lasted two days. The first was for keynote presentations, and the second was for the ‘unconference,’ a self-organized gathering about key issues for the organization. The result is that Students for Free Culture has finally solidified its goals and has a roadmap for changing our college campuses into Open Universities.

The first day was quite good, lots of interesting talks from big players like Lessig, and I got to chat with some really interesting people (Ron Paul’s campaign manager?! That might need its own post…). The day was followed by a night of after-partying with awful music and even worse dancing.
The second day was much more interesting.

To give you some background, I’ve been involved with Students for Free Culture for a few years now. I got the Boston University chapter off the ground and I’ve been involved with FC-related activities throughout Boston/Cambridge and on the internet. However, I’ve always been rather disappointed in the organization. It doesn’t do anything! It doesn’t stand for anything! The critical portrait of an organization composed of the geek-chic sitting around with their iPhones, Twittering away their privacy and whining about the RIAA sadly isn’t too far from the truth.

My pal Tim Hwang had posted a similar criticism on his site:

What I’m trying to say isn’t anything like that Free Culture hasn’t been doing anything worthwhile. It’s just stalled on the national level as the times have changed. […] In promoting widespread action, staying at the forefront as technological issues spill outwards into different innovation communities, and taking aggressive and coordinated public action — the national organization as a whole has been quiet.

There are already so many organizations like Creative Commons and the EFF that are working for the things that Students for Free Culture want, and they have more time and resources than us. We are an organization which exists for the sole purpose of saying “We agree!” The biggest failure in my view was when the SFC failed to do anything to prevent the passing of the Campus-Based Digital Theft Prevention Act, a corrupt, bullshit piece of legislation that essentially gives big media companies some control over college networks.

So we have decided to change our tactics. We are still in agreement with other digitally progressive organizations, but now we have our own agenda.

The largest theme of the unconference was open access in education. We are interested in copyright reform, of course, but we can’t do anything as students. However, we can work for changes in academia, the area where we do have some influence.
For instance, our main project at Boston University is trying to start an OpenCourseWare platform. I thought we were the only students trying to start one from the ground up, but Kevin Donnovan is trying the same thing at Georgetown, so we got to share notes. I also met Zac McCune, who is majoring in Hipster Studies at Brown. He’s doing an experiment in wikifying himself, which includes all of his course notes. We talked about starting a new OpenCourseWare project, OpenCourseNotes. I made a mock-up site but I’m still looking for a partner or two to help out with content management; I simply don’t have enough time for the projects on my plate as it is, let alone this one, but I’d still like to do it if other people are willing to help out. (If you’ve got the time, the interest and some skills, drop a comment or send me an email!)

Ultimately, all the attendees reconvened at the end of the day to have the big discussion about our flaws as an organization and what we should do about them. The result has come to be known as the Wheeler Declaration: the five points that Students for Free Culture stands for, things that we can fight for on our own campuses and that together define an Open University:

The research the university produces is open access.

This means not publishing in journals which require expensive subscriptions, but in journals which allow access to all who want to read them. This is very important for curious minds, for science and for business. More information on this is available at The Public Library of Science.

The course materials are open educational resources.

This means that professors and students have a place to share their educational works under open licenses. The best example of this is MIT’s OpenCourseWare, but there are plenty of others.

The university embraces free software and open standards.

This means not forcing students to use proprietary software if there are Free alternatives and allowing compatibility with open standards for documents. More information on this can be found over at the Free Software Foundation.

If the university holds patents, it readily licenses them for free software, essential medicines, and the public good.

This means that the university does not place restrictions on the manufacture of generic drugs for the third world or use patent enforcement to prevent open source software developers from coding. Universities Allied for Essential Medicines has more on this as well.

The university network reflects the open nature of the internet.

This one might seem the most simple (just be a neutral ISP, don’t spy on us or filter our traffic), but it might be the hardest because of the law I mentioned earlier. The law doesn’t require binding action and may not come into play. It’s a wait-and-see issue with the new administration coming into the White House and I think that SFC should make it a point of contention early on. As Lessig said, we should be picking some fights and trying to snatch some of the low hanging fruit.

So, now we all have things that we can be working for on our own college campuses. We need to make a lot of noise and do a lot of nagging at the bureaucracy. We might have to set up our own servers and supply our own bandwidth to get the ball rolling, but we can’t sit on our asses anymore! We have specific goals and we can all help each other in working to achieve them.

I’m excited and you should be too!
Hope to see you all again next year,
Rich

PS: More photos on my Flickr and those tagged with #fc2008

November 10th, 2008

TV-Links Raid Follow-Up

Just over a year ago, the streaming-video site TV-Links.co.uk was shut down and the owner and some of the staff were arrested.
Now, the site has relaunched as TV-Links.ws and staff member Martin gave me some information about what has happened since the raid.

What has happened since your arrest? What crimes have you and the other staff been charged with? Have you come to any plea agreements or anything like that?

I have had no further visits or contact with the Police since my questioning. I can only assume that we broke no laws since we have not been charged with anything.

Things never went that far, I was questioned, released and that is all, I didn’t need to make any plea agreement. Sin was also released without charge.

Will there be a trial?

Again no, having been released without charge I can only assume that there will be no trial.

Are you afraid of facing another arrest?

Yes, however I believe that the authorities would be better off spending their money and time on more important things where people are actually suffering as a result of real life problems, and spend less time being goons for the rich and powerful.

Why are you still involved with tv-links after your arrest?

Because it’s a hobby for me, I enjoy taking part in a community of people that share my interests in media.

Is tv-links.ws a business venture?

It can’t be called a business in my opinion, businesses make profits and their staff get a pay cheque every week. I’ve never made a cent out of what I do, and don’t want to either. Tv-Links is not a commercial entity, any money that’s made from ads goes directly back into the site, to pay hosting bills etc, and any extra is invested in improving the site for its users.

What precautions are you taking to prevent another repeat incident?

We are now located in Sweden, the site is completely legal according to Swedish law as we do not host content. Linking to other sites is not illegal (which is in effect what we do). We will of course now and always fully abide by the law of our country which is Sweden.
We won’t go into details on the precautions we have taken other than that we have worked closely together with our lawyers to ensure the legality of what we do and that there are measures in place protecting us in case anyone tries to challenge that.

Where do you see the project in 3 to 5 years?

It’s hard to say really, it’s a hostile market and difficult to predict any changes in law, licensing and demand. If the market is there, we’ll be there.

I’d also like to take this time to say that although other tv-links clone sites exist, they are nothing to do with us. Many of them use heavy handed advertising methods to monetize their sites for profit, which we do not condone at all.

It’s good to hear that the boys there have been staying out of trouble and that there was no punishment.

In the year since the arrest, streaming video has gone legitimate with sites like Hulu.com and major networks like CBS and ABC launching their own, ad-supported streaming video services.

The next year will be interesting. Hobbyist sites like TV-Links typically have the advantage in sheer quantity and availability of shows (and they don’t have ads), but the legitimate sites have the advantage in video quality and reliability. Hulu is deliberately keeping the amount of available content low, presumably to push new TV shows and DVD sales of old shows, so there may always be a place for sites like TV-Links, especially as more foreign MegaVideo clones pop up to provide the bandwidth.

There has also been recent research into streaming torrents, so a third alternative may arise which could combine the archival depth of TV-Links and the Pirate Bay with the convenience and quality of Hulu. The internet needs to sort out a way for me to watch every possible Star Trek episode in high-quality whenever I want. This is the challenge, internet, you have exactly one year!

Rich out.

October 17th, 2008

Mother3 Fan Translation

HELL YEAH. I know this site isn’t about video games but I don’t care, I’m so excited about this amazing hack I’m writing about it here anyway.



The Mother3 Fan Translation is now available.

Mother3 is the sequel to the Super Nintendo game EarthBound, which is my favorite game of all time. Mother3 came out for the GBA, but only in Japan, so some dedicated fans have been working tirelessly on hacking the ROM to make it playable for us English-speaking folks. This is a really tough hack, way more complicated than just swapping out a script as you might think. I’ve been following the development progress on the dev blog quite religiously for some time now and it really shows the magnitude of the work that went into creating this.

Thanks and congratulations, Tomato and Jeffman!

I don’t think I’m going to sleep until I play through this game now. See you in a few days.
Rich

October 8th, 2008

Going to Free Culture Conference 2008!


I’m going to Free Culture 2008 in Berkeley, California on Friday! I’m excited to meet all of the free culture people and see all of the speakers. Hopefully I’ll be giving a talk/workshop thing on the second day, not quite sure how they’re going to set it up, though, so we’ll see how that goes..

Anyway, if you’re going to be there you should come and see me! I’d love to talk to you about OpenCourseWare or applied cryptography or anything else.

I’ll be sure to put up some pictures of all the FC kiddies getting hyphy up in the Yay Area, of course.

See you there!
Rich

September 19th, 2008

Anonarchy


There has been a lot of talk about cryptoanarchy, but zero talk about the other side of the equation, anonymity in government. I dub this Anonarchy, and have made a logo accordingly (actually, I made the logo first and the system of government to fit it.)

If you’re not a computer-science person, this Wikipedia article might explain some things. In short, A* is a graph-search algorithm for finding the best path between two nodes, which is the kind of problem you face when picking routes between anonymous relays in mixnets like Tor.

There are pros and cons to this type of government. On one hand, it is a completely pure type of politics. Supposing this is a representative democratic anonarchy, all potential leaders are anonymous and therefore judged solely on their stance on issues and their speeches, not by their age, sex, skin color, history, etc. It is also corruption-proof as the bribee is unknown, and the result of the bribe cannot be verified anyway.

On the other hand, 4Chan.

That is all.
Rich.

September 10th, 2008

Hacking Mac Kiosks

You’ve probably seen a kiosk at some point in your life. They’re the standalone computers you see in malls and lobbies all over the place. They’re typically just a browser and some software to stop you from doing anything else on the computer. This makes me sad. All that computing power just being used for reading email? These computers are yearning to unleash their full potential. Plus, as we shall see, these computers are inherently untrustworthy, so you may need to get out of kiosk mode to make sure you aren’t being keylogged, or so you can install Firefox and Tor and browse anonymously, or you may just need terminal access.

A recent presentation at DefCon announced the release of iKat, a suite of tools for experimenting with kiosk browsers. Before this, 0x000000 had some really good scripts for breaking browsers with JavaScript and the like. Unfortunately, most of these tools are geared towards Windows kiosks, and the ones near me run OS X. So.

I get an hour lunch break. I am bored. But, I have enough time to figure out how to crash our local macs running wKiosk, and to blog about how I did it. With time to get a banana and yoghurt.


First, reboot the computer by holding/spamming Command + Control + Eject (Top right button). (If that doesn't do it, run this Flash overflow exploit to crash wKiosk and then spam reboot.)

During startup, hold down shift to boot into safe mode, which should let you pick a user.

As you log in, hold down Command + D to force the dock.

Then, open up finder and terminal and have your fun. wKiosk might still pop up, but if you've got the finder running you should just be able to force it into the background.


Tada!

Update!:

This article had an interesting response in the comments here and on reddit. I wrote this article with the intention that somebody at a Mac kiosk would want to use a different program and type “hacking mac kiosks” into google and come here. But some readers have been unsatisfied with this as a ‘hack,’ and I will admit it is pretty tame: safe mode doesn’t provide complete access to the operating system, although it does give access to the terminal, which was the point of the article. I left it alone there, but apparently some people need it spelled out for them. There are things called ‘rootkits’ which provide privilege escalation and the rest of the nasty goodness you might require. This is veering into script-kiddie territory which I’m not going to talk about explicitly in this post, but this is why I consider physical access to a terminal on any machine the same as ‘owned.’ It’s only one simple step farther.

Also!: Paul Craig, author of iKat, posted a comment down below, which is really cool. He’s promising a new version of iKat in the next few months with some more non-Windows specific sploits so hopefully we can just skip the emokiosking listed above. Will update when that happens!

Rich

September 1st, 2008

Frankenboob: Don’t Throw Out Old Hard Drives!

Aug 31st is move-out day in Boston, and that means dumpster diving! This year’s haul: 3 computers (2 P4 Dells and an HP Laptop!), a 20″ monitor, 2 power conditioners, a lot of furniture and the Cryptonomicon!

So, after booting up the computers and a bit of poking around, we found this in Mr. Kyle M’s photos folder..

Click for big, uncensored Frankenboob.jpg! NSFW, obviously.

People! Never, ever throw away un-wiped hard drives. Especially if you have a monster fetish and hackers living on your block!

Also: I got some exciting news today, but I’ll tell you tomorrow once I get all the details.

Rich out!

August 19th, 2008

Visualizing Natural Language Processing Data and Extracting Conceptual Relationships

This is the final of three posts wrapping up my experiences with Google Summer of Code 2008 and the Singularity Institute for Artificial Intelligence. (The other two are here and here.)

In this post, I will show how to visualize NLP parsed data using Wikipedia as an example.

As I mentioned in the previous post, the RelEx Crawler can output a HyperGraphDB.

To view the HyperGraph, we will be using the HGViewer package of HyperGraphDB. The version in the source repository when I downloaded it was out of date and broken, so I updated the code so it would compile. You can download that compiled JAR file here, and a tarball of the updated source here.

To use the viewer, we will be using the Scriba environment. Scriba is made by Kobrix Software, who also make HyperGraphDB, so the two integrate very well (in fact, Scriba relies on HGDB). Scriba is a ‘notebook environment’ which allows internal Java scripting with BeanShell, letting us execute sections of Java code inside the document on the fly, which is really quite handy. It’s still buggy and can be a little finicky to set up but it does what we need it to.

After using the Crawler to parse a section of the web, you will have a folder named after the starting URL. Get the HGEnvironment, load this folder into Scriba and start visualizing. If you want a demo, play around with this Scriba document and this small relation graph.

Because Scriba allows in-document scripting, you can create a Crawler, give it a target, crawl, parse, get the resulting RelationGraph and visualize it, all in the same workspace!

Have a look at some pictures beyond the jump..

August 17th, 2008

RelEx Crawler: Natural Language Processing Web Crawler and HyperGraphDB Manager

This is the second of three posts wrapping up my experiences with Google Summer of Code 2008 and the Singularity Institute for Artificial Intelligence.

The RelEx Crawler was the heart of my project. The summary proposal is here on Google’s site and my full project proposal is here on the OpenCog wiki.

Given an input URL and a number of pages to crawl, the crawler will run the text of each page through the RelEx semantic relationship extractor and output the data in a variety of formats before moving on to the next page. The SIAI people wanted an NLP crawler based on the popular ‘Nutch’ crawler, but unfortunately the Nutch Java client was functionally useless, so there was no good way of combining the two products. As a result, this crawler is new (though based very heavily on a guide put out by Sun).
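To make the crawl loop concrete, here is a minimal sketch in Python (the real crawler is Java). It is not the actual crawler code: `crawl`, `fake_fetch`, `fake_parse` and the toy `PAGES` web are all hypothetical names made up for illustration, and the parser is a stand-in for RelEx, not RelEx itself.

```python
from collections import deque


def crawl(start_url, max_pages, fetch, parse, write):
    """Breadth-first crawl: fetch a page, parse its text, write the output, queue its links."""
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        text, links = fetch(url)
        write(url, parse(text))  # emit output before moving on to the next page
        visited.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# A four-page toy "web" so the sketch runs without touching the network.
PAGES = {
    "a": ("Alice likes Bob.", ["b", "c"]),
    "b": ("Bob likes robots.", ["a", "c"]),
    "c": ("Robots are red.", ["d"]),
    "d": ("Nothing here.", []),
}


def fake_fetch(url):
    """Return (page text, outgoing links) for a URL in the toy web."""
    return PAGES[url]


def fake_parse(text):
    """Hypothetical stand-in for the RelEx parse: just split the text into words."""
    return text.rstrip(".").split()


outputs = {}
visited = crawl("a", 3, fake_fetch, fake_parse, outputs.__setitem__)
```

Passing `fetch`, `parse` and `write` in as functions keeps the loop itself independent of the parser and of the output format, which is roughly how one crawler can feed several different writers.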

The signal to noise ratio is a problem when using the entire web as a corpus, so this project was designed with specific knowledge bases in mind. Most of my test cases were done using Wikipedia, and there are also a few Wikipedia-specific tweaks to stop the crawler from getting out into the open web through external links, from crawling edit history and user pages, things like that. Wikipedia also has the advantage of being available under the GNU Free Documentation License, meaning that it can be freely redistributed to avoid any copyright unpleasantness which might arise from redistributing semantically-parsed material from other, non-free knowledge bases. I can’t imagine Microsoft would be too happy if you were freely redistributing a processed version of their Microsoft Developer Network!
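The Wikipedia-specific tweaks boil down to a link filter. The sketch below is one way such a filter could look in Python, not the crawler's actual rules: the host name, the list of skipped namespaces and the function name `should_crawl` are assumptions chosen for the example.

```python
from urllib.parse import urlparse, parse_qs

# Assumed list of namespaces to skip; the real crawler's list may differ.
SKIPPED_NAMESPACES = ("User:", "User_talk:", "Talk:", "Special:")


def should_crawl(url, allowed_host="simple.wikipedia.org"):
    """Return True only for ordinary article pages on the allowed host."""
    parts = urlparse(url)
    if parts.netloc != allowed_host:
        return False  # external link: don't wander onto the open web
    query = parse_qs(parts.query)
    if "action" in query or "oldid" in query:
        return False  # edit forms, page history and old revisions
    if not parts.path.startswith("/wiki/"):
        return False  # anything that isn't a normal /wiki/ page
    title = parts.path[len("/wiki/"):]
    return not title.startswith(SKIPPED_NAMESPACES)
```

For example, `should_crawl("https://simple.wikipedia.org/wiki/Battle_of_Hastings")` passes, while the same page's history URL or a `User:` page is rejected.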

The crawler can output six different formats: simple relations, RelXML, openCogXML, Relex Compact Output, the ARC archival format and HyperGraphDB.

The Relex Compact Output is a new XML-based format created specifically for this project. The problem with crawling and parsing large amounts of data is that the annotation markup blows up the space requirement by at least a factor of 15, which can mean storage requirements in the hundreds of gigabytes for large corpora, so the compact output tries to minimize the size of output data. There is also a need to include other relevant information such as date parsed and the versions of the software used. This format does this, and, coupled with the gzipping provided by ARC, minimizes the output file size.

The other interesting output is the HyperGraphDB output. For more information on hypergraphs, check out the Wikipedia article, and for information on HyperGraphDB, check out the developing company’s webpage.

The crawler can output a specific type of HyperGraph which I’ve called a RelationGraph. There are two types of value links in a relation graph, PropertyLinks and RelationLinks. Using this RelationGraph, we can store and query semantic and grammatical relationships between concepts, even those which appear on different web pages. This also lets us visualize NLP data using HyperGraphDB viewer, which I’ll go into further in the next post.
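If that sounds abstract, here is a rough Python sketch of the idea. It is not the HyperGraphDB schema the crawler actually writes: the class fields, the `relations_about` query and the page names are assumptions for illustration, and I'm assuming a PropertyLink hangs a value off one concept while a RelationLink joins two concepts.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertyLink:
    """Attaches a named value to one concept, e.g. tense(fight) = past."""
    concept: str
    name: str
    value: str


@dataclass(frozen=True)
class RelationLink:
    """Connects two concepts, e.g. _subj(fight, Harold), and records the source page."""
    name: str
    head: str
    dependent: str
    source: str


@dataclass
class RelationGraph:
    properties: set = field(default_factory=set)
    relations: set = field(default_factory=set)

    def add(self, link):
        """File a link under the right collection based on its type."""
        if isinstance(link, PropertyLink):
            self.properties.add(link)
        else:
            self.relations.add(link)

    def relations_about(self, concept):
        """Return every relation touching a concept, whichever page it came from."""
        hits = [r for r in self.relations if concept in (r.head, r.dependent)]
        return sorted(hits, key=lambda r: (r.source, r.name))


graph = RelationGraph()
graph.add(RelationLink("_subj", "fight", "Harold", source="Battle_of_Hastings"))
graph.add(RelationLink("_obj", "defeat", "Harold", source="William_the_Conqueror"))
graph.add(RelationLink("_subj", "defeat", "William", source="William_the_Conqueror"))
graph.add(PropertyLink("fight", "tense", "past"))

about_harold = graph.relations_about("Harold")
```

The point of the query is that `about_harold` pulls together relations from two different pages, because both pages share the single concept node "Harold".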


I’d like to take some time here to thank Linas Vepstas, who is the undisputed king of natural language processing in AI! I would be nowhere without his help and his hard work. Linas, thanks ever so much for everything you’ve done for me. I’ve got some other people to thank too, but I’ll get to that in the third post.

The crawler source (in Java) is located here, along with the compile script. Just dump /crawler/ into your RelEx source path and change build.xml. You’ll probably have to get the HTMLParser library to properly strip the HTML tags from the text. I also used Scott Piao’s Sentence Detector, but I don’t think it’s actually necessary any more.

If you’d like to play with a RelationGraph, here is one containing 3 parsed pages from Simple English Wikipedia, starting at The Battle of Hastings.

Stay tuned for the third post in this series, “Visualizing Natural Language Processing Data and Extracting Conceptual Relationships”!

Rich!

August 15th, 2008

Linkovbot: A Natural Language Processing IRC Bot

This is the first of three posts wrapping up my experiences with Google Summer of Code 2008 and the Singularity Institute for Artificial Intelligence.

Although it had nothing to do with my main project proposal, Linkovbot does use the same NLP software and I did write it to give myself a break from reading too much about graph theory, so that counts..right? It was just a bit of fun, really.

Anyway, as the title describes, Linkovbot is a natural language processing IRC bot. It lurks in an IRC channel, reading what people say and developing a corpus. Optionally, it will only add sentences to the corpus if they are grammatically correct. If you send the bot a private message, it will generate a completely new sentence which will also be grammatically correct. The sentence generation uses Markov chains to generate the text and then uses RelEx’s parse confidence score to verify that the text is grammatically correct. At first, it will probably only repeat things which have already been said, but as the corpus expands, it begins to say novel and complex things.
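The generate-then-filter loop is simple enough to sketch in a few lines of Python. This is an illustration rather than the bot's actual source: the function names are made up, and `looks_grammatical` is a crude hypothetical stand-in for RelEx's parse confidence score, which the sketch does not call.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"


def build_chain(sentences):
    """Map each word to the list of words seen directly after it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=20):
    """Random-walk the chain from START until END is drawn or max_words is reached."""
    words = []
    current = START
    while len(words) < max_words:
        current = rng.choice(chain[current])
        if current == END:
            break
        words.append(current)
    return " ".join(words)


def looks_grammatical(sentence):
    """Hypothetical stand-in for the RelEx parse confidence check."""
    return len(sentence.split()) >= 2 and sentence.endswith(".")


def say_something(chain, rng, attempts=50):
    """Keep generating until a candidate passes the grammar check, or give up."""
    for _ in range(attempts):
        candidate = generate(chain, rng)
        if looks_grammatical(candidate):
            return candidate
    return None


corpus = ["I like robots.", "Life is beautiful.", "An apple is red.", "Kill all humans."]
chain = build_chain(corpus)
sentence = say_something(chain, random.Random(0))
```

With a corpus this small the bot mostly parrots, but because "is" appears in two sentences it can already cross them over into things like "An apple is beautiful." Which is exactly how the trouble starts.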

When we were first playing with it in the #opencog room, all of the SIAI people immediately began to teach it horrible things. I find it absolutely terrifying that the people working on the world’s best project to achieve Artificial General Intelligence are so willing to teach robots to be evil!

Here are some highlights of the first day of tests. Notice how the sentences become longer and more varied as time passes.

I like robots.
Life is beautiful.
Can we talk to linkovbot directly?
Can we talk to linkovbot directly?
What does that mean?
Mine is for everyone.
I like robots.
It is beautiful.
Kill all humans.
An apple is being too picky with its skin.
A ball is beautiful.
It is dead.
It is beautiful.
An apple is beautiful.
Kill all humans.
Save the skin.
Maybe they do not work.
All apples are markov chains from the world.
Human flesh is for everyone.
Kill all humans.
All apples are red.
All apples are red skin.
Save the world.
Are markov chains from the corpus.
It keeps telling me life is only a kitten.
Gmail is being too picky with its sentences.
Total anarchy.
Life is for everyone.
Not all balls are red.
2040 will be a serious chat room.
Human flesh is the corrupted humans.
It generates markov chains from the world.
Sometimes their cache also does not work.
Followers of Mammon cowered in linkov.
I think you might be evil.
Save the internal representation.
First language generation is beautiful.
Time flies like apples are intertwined by definition.
Human flesh is dead.
Kill a strange sentence.
Please Let Knight Rider Die In Peace.
Fun is a warning.
It certainly was that woman.
Kill all humans.
Randomly just need panadol.
The old obsessions are humans.

Of course, the sentence generation is very, very crude. While these are all technically grammatically correct (at least according to the RelEx confidence score, which isn’t always entirely accurate), they don’t necessarily make sense. The Markov generation originally came from a Python script by John Schanck, Python Ninja, which is based on another one by Greg McFarlane, which is based on the original idea and code by Bruce Ellis. John wrote his script to see if computers could rap if you trained them on the complete works of “Notorious” Christopher George “Big Poppa” Latore Wallace. It’s a brilliant idea, producing such gems as:

Yeah, i see you
what’s beef?
Beef is?
we ain’t no cryin at the wake up
hoodie to sleep
beef is deadly

The next step would be to cross with a rhyming dictionary and a syllabic counter, if anybody is up for it! Perhaps we could start the first “human competitive” AI rap battle! A bit like the humies, only for freestyling instead of scientific achievement. This needs to be investigated further.

I’ve wandered off topic. Back to sentence generation.

I’ve considered things like seeding the Markov chains to the subject of the input sentence, but I’d really much rather come up with an entirely different, more intelligent way of generating new sentences. This pertains to the main part of my project, so I will write more about this in the third post.

The IRC aspect of the bot is done in Python using python-IRCLib. It’s very simple, so it might be a good example if you are just starting to write your own IRC chat bot. Take a look at the source here (linkov.pys).

Although this was a just-for-fun project, it’s only a 1-line change in ‘relex.sh’ to turn this into an NLP research bot which saves NLP parses of IRC chats.

The whole package with RelEx included is available here (linkovbot.tar.gz). To run it, you’ll need to make sure you have all the necessary dependencies, namely Link-Grammar, which you can read about here.

Stay tuned for the next two posts in this series, “RelEx Crawler: Natural Language Processing Web Crawler and HyperGraphDB Manager” and “Visualizing Natural Language Processing Data and Extracting Conceptual Relationships”!

I’ll leave you with a final, chilling thought from one of the other OpenCogers when we noticed how scary the robot was acting:

Systems without goals adopt the goal of evil.

We’re fucked!
Rich

Complete Archives

Complete archives are available here.