Over the last couple of days, we’ve been migrating from GoDaddy to Slicehost. GoDaddy hosting has served us well so far, but we got greedy and wanted more.
Setting up the slice was rather painless, and we have full control over everything we want to do with it. Plus, the Slicehost documentation is pretty good.
I would strongly recommend Slicehost. Keep in mind, however, that their hosting services are pretty much “built for developers”. If you’re not comfortable with Linux flavours, things may get hairy. Be warned.
In case you want a slice on Slicehost, and need help, help is right here.
If you don’t want to deal with all the server stuff, TMDHosting, SiteGround, and GoDaddy are very sound options for your hosting needs.
As stupid as it looks, and even though it “does NOT make any sense” from many angles, NewsXperiment brings together a few interesting software technologies and paradigms.
The NewsXperiment project consists of two parts: the NewsXperiment Scrambler Engine (NSE) and the web frontend.
The NewsXperiment Scrambler Engine runs offline. It gathers, processes, and scrambles news items, then outputs a zip file containing the pickled scrambled items.
Once executed, NSE goes through its categorized feed repository and retrieves the feeds, thanks to Mark Pilgrim’s excellent “feedparser” library.
Now that the feeds are read, the engine performs the following:
randomly picks a certain number of news items from each category as base feeds.
randomly associates a certain number of scrambler feeds to each base feed.
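The two selection steps above can be sketched roughly as follows. This is a minimal illustration, not the actual NSE code: the function name, the parameter names, the item layout, and the choice to draw scramblers from every category are all assumptions.

```python
import random

def pick_base_and_scramblers(feeds_by_category, base_per_category=2,
                             scramblers_per_base=3, seed=None):
    """Pick base items per category and attach random scrambler items to each."""
    rng = random.Random(seed)
    # Assumption: scramblers may come from any category, so flatten everything.
    all_items = [item for items in feeds_by_category.values() for item in items]
    plan = []
    for category, items in feeds_by_category.items():
        # Never ask for more items than the category actually holds.
        bases = rng.sample(items, min(base_per_category, len(items)))
        for base in bases:
            # A base item is never used as its own scrambler.
            candidates = [item for item in all_items if item is not base]
            scramblers = rng.sample(candidates,
                                    min(scramblers_per_base, len(candidates)))
            plan.append({"category": category, "base": base,
                         "scramblers": scramblers})
    return plan

# Hypothetical news items; real ones would come from the parsed feeds.
feeds = {
    "world": [{"title": "w1"}, {"title": "w2"}, {"title": "w3"}],
    "sports": [{"title": "s1"}, {"title": "s2"}],
}
plan = pick_base_and_scramblers(feeds, base_per_category=2,
                                scramblers_per_base=3, seed=42)
```

Passing a seed makes a run repeatable, which helps when debugging a scramble that came out badly.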
At this point, the engine has the initial data in place. Then comes the scrambling. Before anything is scrambled, however, all the entries picked for scrambling need to be tagged, chunked, and chinked.
Using NLTK, all the titles and summaries that were read are tagged, chunked, and chinked. (I love this part.)
According to the chunk and chink data, each base feed item’s title and summary are scrambled with the set of items that was assigned as the scrambler for that base. Of course, this does not always result in a well-constructed sentence.
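To give a feel for the scrambling step, here is one possible strategy in plain Python: swap the noun phrases of the base item for noun phrases taken from its scramblers. It is only a sketch under assumptions. The (label, text) pair format stands in for whatever the chunker really produces, and the swap rule itself is a guess, not the engine’s actual logic.

```python
import random

def scramble(base_chunks, scrambler_chunks, rng=None):
    """Replace each noun-phrase chunk in the base with one from the scramblers."""
    rng = rng or random.Random()
    # Pool of replacement noun phrases collected from the scrambler items.
    pool = [text for label, text in scrambler_chunks if label == "NP"]
    words = []
    for label, text in base_chunks:
        if label == "NP" and pool:
            # Swap in a random noun phrase and remove it so it is not reused.
            words.append(pool.pop(rng.randrange(len(pool))))
        else:
            # Everything that is not a noun phrase keeps the base wording.
            words.append(text)
    return " ".join(words)

# Hypothetical pre-chunked titles as (label, text) pairs.
base = [("NP", "The mayor"), ("VP", "opened"), ("NP", "a new bridge")]
scrambler = [("NP", "polar bears"), ("VP", "danced with"), ("NP", "the stars")]
headline = scramble(base, scrambler, rng=random.Random(7))
```

Keeping the verbs of the base item and replacing only the noun phrases tends to preserve the sentence skeleton, which is one reason the output still reads like a headline even when it makes no sense.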
At some point, the scrambling process is complete and it is time to generate the output file.
The output file is built from the scrambled items and consists of a list of titles, summaries, and links back to the news items that were used to create them. This file is a pickle dump of dictionary elements.
The output file is datestamped, and zipped. Zip file because, doh!, it’s compressed. Plus, I couldn’t find a way around uploading the pickle content to Google AppEngine. Very likely a MIME type issue, but didn’t dig deep into that. A zipped pickle dump was all I needed, and I had it.
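A datestamped, zipped pickle of that kind can be produced with the standard library alone. The sketch below is an assumption about the shape of the output: the file naming scheme, the dictionary keys, and the function names are made up for illustration.

```python
import io
import pickle
import zipfile
from datetime import date

def build_archive(scrambled_items, day=None):
    """Pickle the scrambled items and wrap them in a datestamped zip archive."""
    day = day or date.today()
    stamp = day.strftime("%Y%m%d")
    payload = pickle.dumps(scrambled_items)
    buffer = io.BytesIO()
    # ZIP_DEFLATED actually compresses; the default mode only stores.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(f"news-{stamp}.pickle", payload)
    return f"news-{stamp}.zip", buffer.getvalue()

def read_archive(zip_bytes):
    """Reverse of build_archive: unzip and unpickle the first member."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        first_member = archive.namelist()[0]
        return pickle.loads(archive.read(first_member))

# Hypothetical scrambled item with a title, a summary and source links.
items = [{
    "title": "Polar bears opened the stars",
    "summary": "A scrambled summary.",
    "links": ["http://example.com/a", "http://example.com/b"],
}]
name, data = build_archive(items, day=date(2008, 8, 3))
restored = read_archive(data)
```

One caveat worth remembering: unpickling runs arbitrary code, so the reading side should only ever load archives it produced itself.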
Very well, I have the zipped pickles. What do I do with them? If I cannot get them into Google AppEngine’s data store, how could I possibly share them?
I started playing with Google AppEngine a few months ago. First, tried to port over my work in SillyDomainNames.com into AppEngine, but gave up on it after a short while. Always being on the lookout for new and interesting ideas, I somehow came up with this experimental-mash-up-site concept; NewsXperiment. Not your everyday mashup site, something different and unique. I spent some time experimenting with the code locally and after I brought the Natural Language Toolkit (NLTK) into the mix, it immediately gained some traction. Brilaps was looking for a project to test out the new Google AppEngine and it made sense to let NewsXperiment.com be the guinea pig. Google AppEngine turned out to be a great idea and it didn’t take long for me to bring this project from an idea to the first beta release. I haven’t come across anything similar yet, so if anything exists, please let me/them know.
So what is NewsXperiment? What can I do there? What is the roadmap and what were the challenges during development? I’ll try to answer those questions in this blog post.
What is NewsXperiment?
NewsXperiment is a news scrambler/generator site. In the simplest possible terms, NewsXperiment reads a bunch of RSS feeds, approximately 200, from a number of highly respected sources and scrambles their news titles and summaries using Natural Language Processing techniques. The idea is to create interesting, funny, and/or timely new stories based on actual real-time events as reported by news sources of all kinds across the Internet. The mash often produces comical stories such as “Princess Di Dancing with the Polar Bears at Golden Gate Bridge”. How would it come up with such a story? Well, at the time of our scrambling there was probably some unrelated news about Princess Di, Dancing with the Stars, Polar Bears, and the Golden Gate Bridge. We randomly select and break apart each story, scramble them up, and rebuild them to construct amusing and well-structured stories. The magic is in the reconstruction. The engine is still in beta, so the scrambled Title/Summary text still needs some refinement, but it is worth a bookmark and a glance every day or so, as it already generates some pretty interesting mashups several times a day.
You can simply poke around and glance at a few news entries. Or, if you feel like digging in more, you can rate some stories and/or comment on them. Better yet, you can write your own version of the scrambled story using the references provided for that news item. On top of all that, you can provide feedback and become a true NewsXperiment star.
Roadmap and the challenges during development?
As of Aug 3rd, 2008 the basic functionality of an interactive website is in place.
Scrambler Engine, News Upload, and Admin level CRUD operations, Visitor Comments, Visitor Rating are all implemented.
NewsXperiment hits Flickr per news item and grabs a relevant image. (This is the fun part.)
Utilizes NLTK libraries within the scrambler engine, which runs offline.
The generated output is a “zipped pickle” file and it is uploaded to Google AppEngine using appcfg.py.
Runs on Google AppEngine.
Uses Django for server-side rendering.
Uses Yahoo! User Interface (YUI) Library for client-side JavaScript and CSS.
What’s in the bag for near-future development:
Sometime in the near future, a “Fork This News” feature will be added. It will enable visitors to make a copy of an existing news entry and write their own version, which can in turn be rated, commented on, and forked over and over again. Currently, visitors can simulate this using the “Comment” form attached to each news item.
A better front-end design would be nice, but I highly doubt I’ll lose sleep over it. I absolutely wouldn’t mind if someone with good design skills took a stab at it.
NewsXperiment surely needs a new logo.
I’ll leave the challenges and the technical mumbo jumbo to another post… Any feedback is appreciated. Please feel free to comment here. If you prefer email, see the “About” link on NewsXperiment.com for contact info.
I recently received an email from one of the registered users of ocszone.com.
The email follows:
from a user….
At Joomla Extensions it is said that moseasymedia is free, but here at your download you demand credit card details and want to charge
This is bull shit
I’ll fucken tell everyone I can about your con
fucken piece of shit
This email is obviously from someone who does not know what GPL means. Indeed, let me rephrase: from someone who does not know how to read.
ocszone.com clearly states, in multiple places, that the software we develop and distribute is free of charge unless stated otherwise. Most of those notices are in different colors and in bold, so that they catch the eye. And we do NOT demand any private or financial information.
pubhubsubhub is a data (news) aggregator that can deliver the topics you are interested in as instant messages.
You can almost consider it as an RSS Reader with the convenience of an IM.
Anytime I come up with (or come across) an idea, if it’s applicable to the Web, Google’s App Engine is my platform of choice. While glancing through the SDK documents I came across Prospective Search, and further reading led me to XMPP.
Long story short, after reading through the docs and looking at a few samples, pubhubsubhub was born: a news aggregator capable of delivering the subscribed topics (search results) as instant messages, which could possibly turn into something more.
It is quite neat with Adium plus Growl notifications. If you use the GTalk widget in GMail, I’d recommend keeping the pubhubsubhub chat popped out.
Currently, the search data comes from approximately 1000 sources with high recent popularity. That’s why I personally find it very useful for keeping an eye on recent and trending events.
It’s been about 9 months since the first release of MiaCMS, and since then we have had four public releases: MiaCMS 4.6.4, MiaCMS 4.6.5, MiaCMS 4.6.5 SP1, and finally MiaCMS 4.8.
We’ve been sticking to our roadmap and working hard to get those in one by one, as our time allows. We have almost 500 commits in our svn. That should be an indicator for some level of activity on the MiaCMS front.
So, what’s been happening over in the Mambo world?
The following line is from the 4.6 branch of Mambo.
r1752 | elpie | 2008-10-01 23:42:57 -0700 (Wed, 01 Oct 2008) | 1 line
Two things are interesting about this SVN log line: its age (it is pretty old as of January 20th, 2009) and the committer. We all thought elpie had left the Mambo world, never to return.
Another fact is the 4.7 branch of Mambo. It’s still closed to the public. When we forked MiaCMS in May 2008, we pretty much forked what Mambo 4.7 was at that point in time. If that source is still being worked on by the Mambo Team, what could they possibly be adding? Or perhaps they gave up on Mambo 4.6 and Mambo 4.7. Perhaps they are working on Mambo 5.0, which Chad initiated a long, long time ago. I doubt it. Ah, not a single commit in that branch! I guess the Mambo Team is not developing Mambo 5.0 either.
Once in a while, I go over to the Mambo Forums and check out what’s going on. Not much! Just a few survey posts, a graphics competition that keeps getting extended, and some dummy chat stuff. There are a few help requests too.
No mention of elections, the board of directors, or certain legalities. As far as I know, the election deadline set by the Australian Government passed months ago. Are they not an illegal non-profit organization yet?
For me, it’s very difficult to grasp why anyone would go for Mambo at this point. Old and unmaintained code, bad publicity, bad management, no roadmap, no future, lots of legal issues, etc. You name the negativity, Mambo has it.
You can leave aside all the hardcore architectural work MiaCMS has gone through since the fork. Just look at the brand new goodies: Content Revisioning, OpenID, RESTful API, RSS Enhancements, Akismet Comments, Enhanced Charting, MOSTlyCE upgrades, and more…
See it, folks! The Mambo project is old and outdated, it’s not maintained, it’s essentially dead. If you want a Mambo-like CMS with the “power in simplicity” motto, go with MiaCMS, which is still very actively developed and maintained by the same team that once brought glory to Mambo.
MiaCMS will have some very interesting news coming up in the following weeks. I tell you now; next-gen MiaCMS will be one kick-ass project.
Save Mambo from her misery, and switch over to MiaCMS.
After getting a few very interesting emails, I decided that I should provide a bit of historical and informal insight for those who might be curious about what “Save My Ass” may be.
SaveMyAss (Save My Ass) is a convenient way to clear the call and/or SMS records on Android-based phones. The delete (or purge) process runs automatically when the app launches, without further user intervention. The app deletes either a preset number of messages set in the preferences, or messages by age (e.g. last 10 min, last 2 hours, etc.).
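The two purge modes boil down to a small selection rule. The sketch below shows that rule in Python rather than in the app’s own Android code, and it rests on assumptions: the record layout, the function name, and the reading of “a preset number” as “the newest N records” are mine.

```python
from datetime import datetime, timedelta

def select_for_purge(records, now, count=None, within=None):
    """Return the records to delete, newest first.

    within: delete everything newer than now - within (purge by age).
    count:  delete the newest `count` records (purge by number).
    """
    newest_first = sorted(records, key=lambda r: r["timestamp"], reverse=True)
    if within is not None:
        cutoff = now - within
        return [r for r in newest_first if r["timestamp"] >= cutoff]
    if count is not None:
        return newest_first[:count]
    # No preference set: delete nothing rather than guess.
    return []

# Hypothetical call/SMS records.
now = datetime(2011, 1, 1, 12, 0)
records = [
    {"id": 1, "timestamp": now - timedelta(hours=3)},
    {"id": 2, "timestamp": now - timedelta(minutes=5)},
    {"id": 3, "timestamp": now - timedelta(minutes=30)},
]
by_age = select_for_purge(records, now, within=timedelta(minutes=10))
by_count = select_for_purge(records, now, count=2)
```

Returning an empty list when neither preference is set is a deliberate choice here: an app that runs on launch without asking should err on the side of deleting too little.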
Why did I build this app? Just because. (This is the easy answer. I could’ve said, just for shits and giggles.)
Why did I build this app? Just because I thought it would be an interesting challenge. I worked extensively with the Android Messaging app and SMS internals during the development of “Txtract – SMS Backup for Android”, and “Save My Ass” would be the icing on the cake in terms of gaining more expertise on Android.
Why did I build this app? About five months ago, I was about to get a ticket because a police officer thought that I was on the phone while driving. I swear, I was not on the phone, and I was not doing anything on my phone. I was pulled over, and I had to show the officer my call and message history to convince him that he had stopped me for no reason. That incident was indeed the spark that got “Save My Ass” built. I don’t want to give anyone any ideas about how they may use the application. The rest is up to the users’ imagination.
About the “Save My Ass” name; it’s supposed to be just funny and provocative. Nothing more, nothing less. I can only hope that no one will find the name offensive.