Dave Johnson on open web technologies, social software and software development
Joe Cheng posted another entry in his series explaining the details of AtomPub support in Windows Live Writer (WLW), titled WLW+AtomPub, Part 2: Authentication.
Wondering what WLW looks like? Travelin' Librarian has a nice set of screenshots of WLW on Flickr, including shots of the installation process, HTML mode, preview mode and more. Looks pretty sweet.
<img src="http://farm2.static.flickr.com/1321/1366689098_8323af9281.jpg" alt="Screen-shot of Windows Live Writer" />
Dave Johnson in Blogging
01:08PM Oct 20, 2007
Comments [0]
Tags:
atompub
blogging
microsoft
I spent a fair amount of holiday time trying to figure out how to share and back up the important files on our various home computers. The solution I settled on was geeky bordering on goofy:
For documents I use Subversion. On each computer, each user's files are kept in a directory that is under Subversion source code control. Since nobody else in the family knows about Subversion (yet), I have to visit each computer periodically and commit any new files or changes. I had hoped that approach would work for all of my files, but Subversion on the Slug is way too sluggish when it comes to big files.
So, for photos and other big binary files I use the Slug as a simple file server. I keep my photos and videos organized into roughly DVD-size directories (i.e. about 8GB), and I periodically copy them to the Slug and burn DVDs for off-site storage.
And finally, for full backups I use disk "cloning" software. Every month or so I use Carbon Copy Cloner to make a full-disk backup of our two Mac laptops to a USB drive.
Sounds like a total pain in the ass, doesn't it? But a growing number of folks have multiple computers and piles of photos and videos to back up, so my problems are far from unique. That's why I think Windows Home Server is going to be a hit.
It's a server appliance with no monitor or keyboard. You just plug it into your home network, put it in a corner and it solves all of your PC backup problems. It quietly makes full-disk image backups of each of your Windows PCs and it gives you a place to share files with other folks at home and over the net. It's of no use to me since most of my home computers run some form of UNIX, but 95% of the world is hooked on Windows -- they're gonna want this thing. Check out Paul Thurrott's Windows Home Server Preview for more details.
Dave Johnson in Microsoft
05:11PM Jan 09, 2007
Comments [3]
Tags:
microsoft
server
windows
Robert Burke: And it was kinda cool to be the Microsoft guy running Apache and PHP on his laptop :)
I'm sorry I missed that talk.
Dave Johnson in Microsoft
08:58AM Jun 29, 2006
Comments [0]
Tags:
apache
conferences
microsoft
Joe Gregorio: This is just like WinFS.
Except that it is shipping today.
And it just works.
And it doesn't require an upgrade to your operating system.
Dave Johnson in Microsoft
07:03AM Oct 19, 2004
Comments [0]
Tags:
google
microsoft
Microsoft flexes more open-source muscle | CNET News.com: "FlexWiki is the third piece of Microsoft code that the company has released this year under an open-source license, all under the Common Public License (CPL). In April, Microsoft posted its Windows Installer XML (WiX) to SourceForge.net, following up a month later with the posting of the Windows Template Library (WTL) project."
Dave Johnson in Microsoft
05:20PM Sep 28, 2004
Comments [2]
Tags:
microsoft
opensource
wiki
How do you do it? I need to provide some examples to show how to parse RSS with Java and C#. I have written simple parsers using the common XML parsing techniques such as DOM, SAX, and Pull. I have also written some examples that use parser libraries, but I have yet to find a good and free RSS parser library for .Net. Lazy-web, please help me out here.
When you assume...
If you assume that RSS is XML and you are just interested in getting titles, descriptions, links, and dates, then it is pretty easy to write a simple parser that can handle most forms of RSS, including RSS 1.0, RSS 2.0, and some forms of funky RSS. If you need to handle more than those basic elements, then I recommend that you use a parser library.
Parser libraries
Python programmers are blessed with a great newsfeed parser library: Pilgrim's regex-based Universal Feed Parser, which can parse any feed, even if it is not valid XML. I don't think Pilgrim's parser will port easily to Jython, the Java version of Python, because Jython is missing some important Python libraries and uses a Java regex engine that differs from Python's built-in one. The same thing probably goes for IronPython, the .Net version of Python. By the way, Lazy-web, would you please port Pilgrim's parser to Jython?
So, Java developers don't have the Universal Feed Parser, but we do have two active projects that are developing full-featured RSS (and Atom) parsers: Informa (used by Javablogs.com) and Rome. .Net developers have RSS.Net, but it is incomplete and development seems to have completely stagnated back in November of 2003.
So how do you parse RSS with .Net? I started looking around and digging into source code. I found that Dare built his C# based RSS parser for RssBandit on top of an SGML parser. Joe built his C# based RSS parser for Aggie using good old System.Xml. I guess you just have to do it by hand, so here goes...
My examples
Now it's time for the lazy web to point and laugh at my feeble efforts to build simple RSS parsers in C#. I have two examples for your ridicule. After you are done laughing, please, .Net heads, help me out and tell me what I am doing wrong and where I can make improvements.
First, here is a simple C# RSS parser method that uses a DOM based approach. It extracts the basic elements of title, description, link, and pubDate from the channel and item levels and it puts them into a dictionary (just like Pilgrim's parser does). It can handle RSS 1.0, RSS 2.0, and some forms of funky RSS. Have a look:
// Requires: using System; using System.Collections; using System.Xml;
public IDictionary ParseFeed(String fileName) {
    XmlDocument feedDoc = new XmlDocument();
    feedDoc.Load(fileName);
    XmlElement root = feedDoc.DocumentElement;

    // RSS 2.0 (<rss> root) has no default namespace; otherwise assume RSS 1.0.
    string defaultNS = null;
    string contentNS = "http://purl.org/rss/1.0/modules/content/";
    string dcNS = "http://purl.org/dc/elements/1.1/";
    string xhtmlNS = "http://www.w3.org/1999/xhtml";
    if (root.Name.Equals("rss")) {
        defaultNS = null;
    }
    else {
        defaultNS = "http://purl.org/rss/1.0/";
    }

    // Channel-level elements
    XmlElement channel = (XmlElement)root.GetElementsByTagName("channel").Item(0);
    IDictionary feedMap = new Hashtable();
    feedMap.Add("title", GetChildText(channel,"title",defaultNS));
    feedMap.Add("pubDate", GetChildText(channel,"pubDate",defaultNS));
    feedMap.Add("dc:date", GetChildText(channel,"date",dcNS));
    feedMap.Add("description", GetChildText(channel,"description",defaultNS));
    feedMap.Add("link", GetChildText(channel,"link",defaultNS));

    // In RSS 2.0 the items sit inside the channel; in RSS 1.0 they sit under the root.
    XmlNodeList items = null;
    if (root.Name.Equals("rss")) {
        items = channel.GetElementsByTagName("item");
    }
    else {
        items = root.GetElementsByTagName("item");
    }

    // Item-level elements
    IList itemList = new ArrayList();
    feedMap.Add("items", itemList);
    for (int i=0; i<items.Count; i++) {
        IDictionary itemMap = new Hashtable();
        itemList.Add(itemMap);
        XmlElement item = (XmlElement)items.Item(i);
        itemMap.Add("title", GetChildText(item,"title",defaultNS));
        itemMap.Add("link", GetChildText(item,"link",defaultNS));
        itemMap.Add("guid", GetChildText(item,"guid",defaultNS));
        itemMap.Add("pubDate", GetChildText(item,"pubDate",defaultNS));
        itemMap.Add("dc:date", GetChildText(item,"date",dcNS));
        itemMap.Add("description", GetChildText(item,"description",defaultNS));
        itemMap.Add("content:encoded", GetChildText(item,"encoded",contentNS));
        itemMap.Add("body", GetChildText(item,"body",xhtmlNS));
    }
    return feedMap;
}

// Returns the text of the first matching descendant element:
// null if there is no such element, "" if the element is empty.
private string GetChildText(XmlElement element, string childName, string namespaceURI) {
    string text = null;
    XmlNodeList nodeList = null;
    if (namespaceURI != null) {
        nodeList = element.GetElementsByTagName(childName, namespaceURI);
    } else {
        nodeList = element.GetElementsByTagName(childName);
    }
    if (nodeList!=null && nodeList.Item(0)!=null) {
        if (nodeList.Item(0).FirstChild!=null) {
            text = nodeList.Item(0).FirstChild.Value;
        } else {
            text = "";
        }
    }
    return text;
}
And here is the same thing, but using a pull-parser based XmlTextReader approach:
// Requires: using System; using System.Collections; using System.Xml;
public IDictionary ParseFeed(String fileName) {
    XmlTextReader reader = new XmlTextReader(fileName);
    reader.WhitespaceHandling = WhitespaceHandling.None;
    IDictionary feedMap = new Hashtable();
    IList items = new ArrayList();
    IDictionary itemMap = null; // non-null only while inside an <item>
    feedMap.Add("items", items);
    while (reader.Read()) {
        bool isStart = reader.NodeType.Equals(XmlNodeType.Element);
        bool isEnd = reader.NodeType.Equals(XmlNodeType.EndElement);
        if (isEnd && reader.Name.Equals("item")) {
            itemMap = null;
        }
        else if (isStart && reader.Name.Equals("item")) {
            itemMap = new Hashtable();
            items.Add(itemMap);
        }
        // Item-level elements: advance to the text node and store its value
        else if (isStart && itemMap!=null
                && reader.Name.Equals("title")) {
            reader.Read();
            itemMap.Add("title", reader.Value);
        }
        else if (isStart && itemMap!=null
                && reader.Name.Equals("link")) {
            reader.Read();
            itemMap.Add("link", reader.Value);
        }
        else if (isStart && itemMap!=null
                && reader.Name.Equals("description")) {
            reader.Read();
            itemMap.Add("description", reader.Value);
        }
        else if (isStart && itemMap!=null
                && reader.Name.Equals("content:encoded")) {
            reader.Read();
            itemMap.Add("content:encoded", reader.Value);
        }
        // isStart check added: without it the </body> end tag also matches
        // and the second Add("body", ...) throws on the duplicate key
        else if (isStart && itemMap!=null
                && reader.Name.Equals("body")) {
            reader.Read();
            itemMap.Add("body", reader.Value);
        }
        else if (isStart && itemMap!=null
                && reader.Name.Equals("pubDate")) {
            reader.Read();
            itemMap.Add("pubDate", reader.Value);
        }
        else if (isStart && itemMap!=null
                && reader.Name.Equals("dc:date")) {
            reader.Read();
            itemMap.Add("dc:date", reader.Value);
        }
        // Channel-level elements (reached only when outside an <item>)
        else if (isStart && reader.Name.Equals("title")) {
            reader.Read();
            feedMap.Add("title", reader.Value);
        }
        else if (isStart && reader.Name.Equals("description")) {
            reader.Read();
            feedMap.Add("description", reader.Value);
        }
        else if (isStart && reader.Name.Equals("link")) {
            reader.Read();
            feedMap.Add("link", reader.Value);
        }
        else if (isStart && reader.Name.Equals("pubDate")) {
            reader.Read();
            feedMap.Add("pubDate", reader.Value);
        }
        else if (isStart && reader.Name.Equals("dc:date")) {
            reader.Read();
            feedMap.Add("dc:date", reader.Value);
        }
        else if (isStart && reader.Name.Equals("image")) {
            // skip images
            while (reader.Read()) {
                if (reader.Name.Equals("image")
                        && reader.NodeType.Equals(XmlNodeType.EndElement)) {
                    break;
                }
            }
        }
    }
    return feedMap;
}
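Both versions hand back the same shape: a dictionary of channel fields plus an "items" list of per-item dictionaries. Here is a rough sketch of how a caller might walk that result. The FeedSummary class, the FormatItem helper, and the hand-built sample data are made up for illustration and stand in for a real ParseFeed call; the fallback from pubDate to dc:date is my assumption about what a caller would want, not something the parsers do themselves.

```csharp
using System;
using System.Collections;

public class FeedSummary {
    // Hypothetical helper: builds one display line for an item dictionary.
    // Falls back from pubDate to dc:date. A missing key and a null value
    // both read back as null from a Hashtable, so either parser's output works.
    public static string FormatItem(IDictionary item) {
        object date = item["pubDate"] ?? item["dc:date"];
        return String.Format("{0} ({1})", item["title"], date ?? "no date");
    }

    public static void Main() {
        // Hand-built stand-in for the dictionary that ParseFeed returns.
        IDictionary item = new Hashtable();
        item.Add("title", "Hello");
        item.Add("dc:date", "2004-09-01T04:55:00Z");
        IList items = new ArrayList();
        items.Add(item);
        IDictionary feedMap = new Hashtable();
        feedMap.Add("title", "Example feed");
        feedMap.Add("items", items);

        Console.WriteLine("Feed: " + feedMap["title"]);
        foreach (IDictionary entry in (IList)feedMap["items"]) {
            Console.WriteLine(FormatItem(entry));
        }
    }
}
```

Swap the hand-built feedMap for the result of either ParseFeed method and the loop stays the same.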
Have some better examples of parsing RSS with .Net? Please point me to them.
Dave Johnson in Microsoft
04:55AM Sep 01, 2004
Comments [10]
Tags:
microsoft
I've been using SysInternals' freeware Windows Process Explorer and other SysInternals utilities for years now, but I never noticed this one. AutoRuns "shows you what programs are configured to run during system bootup or login" and allows you to delete or disable any of them. Via Jonathan Hardwick.
Dave Johnson in General
07:37AM Aug 28, 2004
Comments [0]
Tags:
microsoft