Semantic Web
Kevin Kelly gave this talk at TED in 2007. It’s worth watching.
He touches on a number of things ranging from the history of the Internet and Moore’s Law to the future ubiquity of Cloud Computing and Kurzweil’s “Singularity.”
He covers concepts like the Semantic Web, and the give-and-take between privacy and participation with relatively light language that any lay person should be able to understand. This is an interesting and entertaining little presentation. Thought I’d share.
[youtube=http://www.youtube.com/watch?v=yDYCf4ONh5M]
Here’s my dilemma. I have a ton of bookmarks on my Del.icio.us account. I love using an online bookmarking system. But still, Delicious and others’ systems for organizing bookmarks don’t really help with a need I bet most users have: Tag-Optimization.
What we need are tools for analyzing and perfecting the organization of bookmarks. Every one of these systems, like Delicious, Furl, StumbleUpon, etc., has the same problem: user-submitted tags are bug-y!!! The engine of the platform needs to guide users toward better tagging! Basically, we need built-in systems for finding the types of redundancies and other tag-errors that we all have. We need debugging software, so our bookmarks can become good, clean representations of how web-users feel about various web resources. “Suggested Tags” and “Popular Tags” are great time-saving features, but I’d also like to have a tool for correcting tag-cancer.
These software offerings, if/when they finally exist, are going to make it increasingly easy to harmonize user-submitted value from folksonomies with the ‘Semantic Web,’ which is right around the corner.
Some examples of areas where I think a robot could help users to clean up tags are:
- Redundant Tags. Usually just alternate forms of the same word (like the plural and singular) but also synonyms. Example: Image, Images, Picture, Pictures, Pix
- Arbitrary Capitalization. HTML vs html etc.
- Vagueness. Like los or awesome (wouldn’t it be safe to assume that all the things you bookmark are ‘awesome’ to you?).
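To make the list above a bit more concrete, here is a rough Python sketch of what such a tag-cleaning robot might do. Everything in it is an assumption for illustration: the synonym table, the list of vague tags, and the crude "drop a trailing s" rule are made up, and this is not anything Delicious actually offers.

```python
from collections import defaultdict

# Hypothetical lookup tables; a real tool would need curated or learned ones.
SYNONYMS = {"picture": "image", "pix": "image"}
VAGUE_TAGS = {"awesome"}


def normalize(tag: str) -> str:
    """Map a raw tag to a canonical key: lowercase, naive singular, synonym lookup."""
    key = tag.strip().lower()
    # Naive singularization: drop one trailing "s".
    # This is wrong for words like "news" or "status".
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return SYNONYMS.get(key, key)


def find_redundant(tags):
    """Group tags that collapse to the same key; keep groups with 2+ members."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return {key: members for key, members in groups.items() if len(members) > 1}


def find_vague(tags):
    """Return the tags that appear in the (hypothetical) vague-tag list."""
    return [tag for tag in tags if tag.strip().lower() in VAGUE_TAGS]


my_tags = ["Image", "Images", "Picture", "Pictures", "Pix", "HTML", "html", "awesome"]
redundant = find_redundant(my_tags)
vague = find_vague(my_tags)
```

The robot would only suggest merges. Deciding whether Pix really means the same thing as Image should probably stay with the user.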
This is a screen-shot of my tagging screen from Delicious. I added the red scribbling to point out just a few of the problems my tags have.

Del.Icio.Us Tags Gone Wild
On several occasions, I’ve set out to clean up my tags manually, but I’ve never made it very far. It’s just too much work.
Maybe the coming overhaul to Del.Icio.Us will add some of these needed features, although somehow I doubt it.
I’ve heard of the MOAT (Meaning Of A Tag) Project, and perhaps this could save us, but like many other ‘Semantic Web’ projects, I haven’t found a way, as a lay person, to utilize it. At some point down the road, maybe someone will make a Delicious-MOAT-erizer Web-App that will clean-up-shop-by-proxy and make the metadata available to the Semantic Web.
[googlevideo=http://video.google.com/videoplay?docid=658439829337469958&q=semantic+web&ei=LjZcSPa1I6TAqwPK4bn_DQ]
This is a comment I posted on the Nodalities blog (or I think I posted it. The form submit resulted in a blank page).
Quote:
It’s ironic, really, that the Semantic Web should struggle so much with semantics!
The problem is that if we present a mixed, complicated, and difficult concept forward, the journalists and media commentators are not going to be able to sort out the tangle of meanings for us. They will present an (over)simplified, half-understood message to the rest of the world. When even a brilliant communicator like Tim Berners-Lee’s message gets scrambled, maybe it’s time to take stock in how we present the Semantic Web, especially to the general media. Maybe, a set of metaphors could help us present these:
The semantic web is a platform (one we already use frequently)! The semantic web is a layer of connectivity (like a concentric ring around the web itself). The semantic web is a series (more than one thing) of enablers (it makes possible, rather than it does)
(me:)
I think there’s a big problem, obviously, with the phrase “Semantic Web.”
It’s easy for Press to confuse the intentions of SemWeb with those of Natural Language Processing.
Talk about ironic:
NLP is really more about human “semantics” than The Semantic Web is. SemWeb technologies are only really semantic by comparison to the HTML-Web, and they’re only really “semantic” from a MarkUp/database/programming point of view, and still, only in comparison to older/existing systems.
As Tim Berners-Lee has pointed out, the name “Semantic Web” wasn’t the best choice of names, but it’s too late to change it. “The Data Web” or “Web Of Data” or “Linked Data Web” or many other names for it would be more accurate and less conducive to misrepresentation than “The Semantic Web” is. But “Semantic Web” has already stuck, and I doubt anyone is going to change it.
Fortunately, “The Semantic Web” sounds lofty enough for people to think it probably is going to be the next big thing. Unfortunately, the name is deceiving to most people and the technologies would probably seem more or less trivial to them anyhow.
SemWeb is a movement that would ideally be taking place among inspired, pro-active developers, but unfortunately, devs are too comfortable with the tools at hand and there’s no visible imminent market force in the development field pushing for movement beyond the same old skill-sets and practices on the ground, at least for most businesses and the programmers they hire. For this reason, we should be glad about all the “Web 3.0” hype, and work to inform the press and the public between the lines where and when we can.
For those of us that understand what “The Semantic Web” is, it is our duty to evangelize RDF and MicroFormats where and when we can.
Soon we will be forgotten. Soon we won’t need to call the “semantic web” anything so this conversation will be meaningless… I hope.
Update… this was actually news back in January. Coincidentally, today it was announced that Comcast is buying Plaxo. Goodbye Plaxo. Nice knowin’ ya.
Got the rumor tip from Scoble (there’s no real info there so don’t bother)
Plaxo? Are you listening? Keep doing what you’re doing, stay behind the scenes, work on enabling users to publish their own data, at will, in Semantic Standards as they become timely (now?) and stay independent of the little tug-of-war between closed, albeit increasingly API-enabled social apps. You’re better than them! Hang in there and you’ll be worth way more! Don’t turn to the dark side!
Competition for traffic will get everyone using RDF and Microformats soon enough… Semantics are like SEO 2.0… The next bandwagon everyone will want to pay way too much for.
Plaxo, you’re in the perfect spot to make money on this. Think Virtual Private Networks, Semantic Publishing to the Web, and Semantic Productivity Tools at home.
Seriously.
In the suggested reading section of the page for the DIY Rel="Me" project over at dataportability.org’s wiki, there’s a link to this blog post, which is an attempt to explore the usefulness of rel="me" to the regular old web user. The article is slightly tunnel-visioned about what you can or can’t do with your browser to exploit MicroFormats. Of course, being able to detect locations or personal contact info through a browser extension is useful and I’m all for it, but beyond a few obvious exceptions like those, The Semantic Web, MicroFormats included, won’t be much use to us at the level of the browser. We will still need Web based portals or “Libraries” or “repositories” or “Catalogs” or what have you, to connect to, in order to really take advantage of this stuff. Semantic markup on pages is great. RSS is an example of how a little bit of semantics can go a long way. But what’s of greater significance is the idea of the Web Of Data, where resources are “semantically” interconnected by leveraging information that’s mapped to the domain of knowledge where it’s useful, and where the relationships between resources are also specified in a machine-understandable way.
Rel="me" is the equivalent of saying “The person represented by this URL is the same person as the person represented by this other URL.” Taking that into consideration, imagine how this would affect the experience of searching the “Web of Documents.” I argue that if enough of us implement rel="me" (or other microformats or RDFa) in our HTML pages, we will empower the Googles and Yahoos to take advantage of the knowledge expressed by this markup. So let’s do it!
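For anyone wondering what a search engine or a home-made “card catalogue” would actually have to do to consume this markup, here is a minimal Python sketch that pulls rel="me" links out of a page using only the standard library. The class name, function name and example URLs are hypothetical, and a real crawler would also need to fetch pages, resolve relative URLs and cope with broken HTML.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values from <a> and <link> tags whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. rel="me nofollow".
        rels = (attr.get("rel") or "").lower().split()
        if "me" in rels and attr.get("href"):
            self.links.append(attr["href"])


def rel_me_links(html: str):
    """Return every rel="me" link target found in the given HTML string."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


# Made-up page fragment with two rel="me" links and one ordinary link.
page = (
    '<a rel="me" href="https://twitter.com/example">My Twitter</a>'
    '<a href="https://example.org/other">Something else</a>'
    '<a rel="me nofollow" href="https://example.com/blog">My blog</a>'
)
found = rel_me_links(page)
```

The publishing side is just as small: it is one extra attribute on links we already write.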
Quotes from the Article I mentioned:
“…So assuming that you went through the trouble to write up your HTML with rel=me, what next, where is that information actually consumed. I don’t think the 2 most popular browsers (IE 7 and Firefox 2) at this time have native support for XFN, I hear Firefox 3 is suppose to have native microformat support but I haven’t looked for it and if it is there, it isn’t immediately obvious to me. The closest thing I can find is a Firefox plugin called Operator. Operator is a microformat capable reader and for the most part seems to be able to consume most of the above microformat standards except rel=me, kind of odd but kind of understandable…”
“…At this time, I can honestly say that XFN rel=me proliferation is limited and experimental at best. It would take a while for mass adoption to happen and requires a lot of user education, adoption by popular social sites like Facebook, MySpace, etc, and native browser support…”
I commented there, and when I take the time to write out a long comment that isn’t something I’ve already written in so many words here, I like to steal my own comment and put it here for anyone who reads my blog. My response:
I felt like I had to chime in and point out that the point of MicroFormats or RDFa isn’t really to make an overnight change in how we use the Web. It’s to create a backbone of linked data so that as Search Engines and other “Libraries” begin to have stores of these relationships between documents and other resources available to work with, they can begin to improve their services. It will be nice when Search is only partly based on scanning for text-strings or combinations of words.
If you were looking for Andrew in Sebastopol, CA, how would you do it? Perhaps you’d google “Andrew Sebastopol CA…”
But what if you could specify that you are looking for a person?
What if you could specify geocoding info or otherwise specify that Sebastopol is a town in Northern California?
What if you could filter your results by the time web pages were created, or by domain (like “show me wiki articles first” or “show me all MySpace profiles”), or by type of site (say, “show me blogs only”)? And finally, and this is where rel="me" comes in, what if you could specify in your search results that you want to see every other document that is an expression of the same person, once you have selected from your query a person named Andrew who lives in Sebastopol, CA? This is what it’s all about. It works because links work backward. In other words, you can already say “show me all the pages that link to this thing…” but what about being able to say “show me all the pages linking to this Twitter page that link using rel="me"” or, better yet, “show me all the pages linked to with rel="me" from any page that links to this Twitter page with rel="me"”? And so on…
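The “and so on” part is really just a walk over a graph of rel="me" links in both directions. Here is a small Python sketch of that idea, assuming some index has already collected (source page, target page) pairs for every rel="me" link it has seen. The function name and the page names are invented, and a real index would probably want to check that links are reciprocated before trusting them.

```python
from collections import defaultdict, deque


def same_person_pages(rel_me_edges, start):
    """Return every page reachable from `start` by following rel="me" links
    forward or backward, any number of hops."""
    neighbors = defaultdict(set)
    for source, target in rel_me_edges:
        neighbors[source].add(target)
        neighbors[target].add(source)  # links "work backward" too

    seen = {start}
    queue = deque([start])
    while queue:  # plain breadth-first search
        page = queue.popleft()
        for other in neighbors[page]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


# Hypothetical index of rel="me" links: (page containing the link, page linked to).
edges = [
    ("blog", "twitter"),
    ("myspace", "twitter"),
    ("blog", "flickr"),
    ("stranger", "elsewhere"),
]
pages = same_person_pages(edges, "twitter")
```

Starting from the Twitter page, the walk picks up the blog and the MySpace profile that point at it, then the Flickr page the blog points to, and leaves the unrelated pages alone.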
The Web is becoming a library. By adding microformats and other semantic markup to our documents, we are making it possible for decent “card-catalogues” to be built, whether they’re being built by google, yahoo! or the guy down the street.
A weekly roundtable discussion about the DataPortability Project in specific, and efforts involved in data portability in general. The show is produced and hosted by J. Trent Adams and Steve Greenberg.
PodCast is HERE
I recommend Episode 7
QUOTE:
We kick off episode 7 of the DataPortability: In-Motion Podcast with the news of the week that MySpace launched “Data Availability” with Yahoo!, eBay, Photobucket, and Twitter. Following immediately on their heels was the announcement that Facebook is releasing “Facebook Connect”, an extension of their 3rd party API providing deeper access to their user’s data.
We’re also joined by Brady Brim-Deforest, founder of Human Global Media, talking about the DataPortability Legal Entity Taskforce. He provides a good overview and update on the process underway to formalize the project under a recognized legal banner.
The featured interview segment is with Danny Ayers, Semantic Web Developer at Talis. He touches on moving from document linking, through microformats, to feature-rich RDF modeling to identify portable data. Contrary to popular belief, he dispels the myth that it’s hard to migrate from a standard SQL data representation into addressable semantic objects.
Danny regularly posts on the following sites:
Also mentioned in the episode:
Planet RDF
I heard about this through Lawrence Lessig’s blog. Professor Lessig is taking the month of May off, and off the grid, which I applaud him for.
What this web app does is allow you to make links that, through the free Apture service for your site, link to numerous resources, all previewable via the same sort of javascript popup you get from Snap or the ZitGist “zLinks” plugin.
You must see this in action. This is inspiring. It shows how much more dynamic web pages can and will be in the near future. I’m a bit sick of the over-use of javascript, ajax, whatever you want to call it. It tends to be resource-heavy on your machine. This is an exception.
[youtube=http://www.youtube.com/watch?v=TznonD_OGXw]
I wonder if these guys are going to implement any Semantic technologies into the data they store… I wonder if they’re going to make deals with bookmarking services like del.icio.us… All my words could automatically be links to mini-libraries of items I’ve bookmarked! It’d look a little ugly given the current style conventions but hey. Let’s change those.
It’s interesting to me to ponder how this non-semantic-web service, because it’s also a library/bookmarking tool, could become hugely useful to the Semantic Web as they snatch up web users’ resources/web-bibliographies.
Oh man. This is a hot item!
[youtube=http://www.youtube.com/watch?v=6eGcsGPgUTw]
Great job, Danny. That’s funny.
I’ve mentioned before how increasingly the ‘Live Web’ or ‘Blogosphere’ (or whatever you want to call this thing) is being infiltrated by Robot Blogs. What they appear to be doing is crawling the web and scraping excerpts of blog posts and reposting the excerpts, linking back to where it came from. They usually say:
“[KeyWord] wrote an interesting post today”
Since they link back to the blog post they scraped, they show up as a trackback in the comments area of the original post. This way, the unsuspecting blogger is linking to the fake blog. The fake blogs seem to be set up in an attempt at monetizing traffic via AdSense ads.
I googled the phrase “wrote an interesting post today” and the top hit was (I probably am the top hit now) some blogger talking about filtering any comment that contains the phrase “wrote an interesting post today.”
I had decided to change my little tagline thingy to this exact phrase as a sort of inside joke for bloggers, but found myself wondering if being associated with that phrase will adversely affect my findability. Perhaps Search Engines or Spam Filters will begin to look out for that phrase?
Already, I bet there are tons of bloggers who filter out comments containing words like “viagra” or “casino,” assuming that there is absolutely no context in which these words could be used in a legitimate discussion. The fact that I am using those words here is proof that there is such a thing as a legitimate discussion which contains them.
Filtering for a word or phrase seems to me to be a slippery slope, especially if we’re talking about Search Engines, since they act as our main interface to the Web.
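Here is roughly what I imagine such a filter looks like, as a tiny Python sketch. The phrase list and function name are my own assumptions, not anyone’s actual spam filter. The point is that a plain substring match can’t tell a spammer from someone talking about spam.

```python
# Hypothetical blocklist of the kind a blogger might configure.
BLOCKED_PHRASES = ["viagra", "casino", "wrote an interesting post today"]


def is_flagged(comment: str) -> bool:
    """Flag a comment if it contains any blocked phrase, ignoring case."""
    text = comment.lower()
    return any(phrase in text for phrase in BLOCKED_PHRASES)


spam = is_flagged("Buy cheap Viagra now")
legit = is_flagged("Filters that block the word viagra also block this discussion of filters.")
clean = is_flagged("Nice write-up on microformats.")
```

Both of the first two comments get flagged, even though only one of them is spam.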
Google: Please don’t hate me because I said Viagra. I’m not a spammer.