The ‘Semantic Web’ is not nearly as hot a topic as it was a few years ago, but if you remember, some of the efforts being made back in the old days (2008?) had to do with embedding semantic identifiers into regular old HTML. The two examples that come to mind are RDFa and Microformats. I haven’t heard a lot of buzz about embedded ‘linked data’ in HTML lately, but I heard today that a new project called schema.org has been launched to let developers add markup to their sites that will help search services glean meaning from the pages. Apparently, Google, Microsoft and Yahoo! are all on board with this project.
I guess we should call this Keywords 2.0.
Anyway, they have a whole taxonomy of ‘things’ laid out. Check out “The Type Hierarchy” page. A great start.
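To make the idea concrete, here is a minimal sketch of what that kind of markup could look like, generated from Python. It assumes the microdata attributes (itemscope, itemtype, itemprop) and the Person type with its name, jobTitle and url properties from the schema.org hierarchy. The function name, the sample person and the example.org address are made up for illustration.

```python
from html import escape


def person_microdata(name: str, job_title: str, url: str) -> str:
    """Return an HTML snippet that describes a person using schema.org microdata.

    person_microdata is a hypothetical helper, not part of any library.
    """
    return (
        '<div itemscope itemtype="http://schema.org/Person">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        f'  <span itemprop="jobTitle">{escape(job_title)}</span>\n'
        f'  <a itemprop="url" href="{escape(url, quote=True)}">homepage</a>\n'
        '</div>'
    )


# Hypothetical sample data, for illustration only.
snippet = person_microdata("Jane Doe", "Librarian", "http://example.org/jane")
print(snippet)
```

The visible page looks the same to a human reader. The extra attributes are only there so a crawler can tell that "Jane Doe" is the name of a person rather than just a string of text.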
I guess this means that a lot of SEO people are gonna start getting work again. It’ll be interesting to see whether people actually start putting this stuff into their CMSs. I suspect not. I suspect that the kinds of companies with data rich enough that they could just rebuild the hooks their apps use to render HTML are already benefiting enough in organic search that they won’t find a need to actually clutter up their code with this stuff. I mean, I find it very unlikely that a site like Disney’s would get out-ranked by some spammer because the spammer used these newer HTML attributes.
Then again, the fact that the major players are on board with this makes me wonder if there isn’t a reason that’s profitable to search companies to finally start getting rid of all the garbage from SERPs. Touch-screen finger fatigue? Even so, it’s all the damn spammers in eastern Europe that’ll have the resources to recode everything, at least in the near future.
Above all, I’m glad to see any attempt at making information more granular. And deep down, I still want the universal distributed database we were all so excited about back in the Web 2.0 days, when the semantic web seemed like it was on the horizon, before Facebook and the mobile app-o-sphere took over.
What do we call this current era? The API-o-sphere? The Walled-garden-o-sphere? Maybe we should just call it Facebook.
The maximum number of friend requests you can send per day, as far as I can tell, is 100. I may be slightly off. Here’s the thing:
When you run out of friend-requests, it doesn’t tell you! So you can end up spending an hour (or hours) adding people without realizing you are accomplishing nothing! The way to tell is you refresh that user’s page, and it doesn’t say ‘friendship requested,’ or you just send the request twice and the second time, it will say the same thing it did the first time: “Do you want to make friends?” instead of “you already requested this person’s friendship” (or whatever the specific wording is, you get the point.)
Disclaimer: I may be blocked or something rather than just hitting a preset speed-bump. I did send a lot of requests. I’ll know tonight at midnight, either somewhere in the US or in the UK.
Recently I found an awesome user group on Last.FM that had showcased one of my tracks as the “sound of the year…”
So I figured I’d better add all of the members as friends.
I know that back in the day, MySpace had a policy that allowed somewhere around 400 actions per day, that is to say, if you sent 400 friend requests, you wouldn’t be able to message anyone or anything…
LastFM doesn’t seem to have much trouble with unwanted spam. I have a few friends on Last.fm that spam me, but it’s all good spam (decent or great music to check out).
I’m basing my number, 100, on the fact that I got through two fifty-user pages before the requests stopped working.
UPDATE: Two days later
After midnight PST, I was able to send 100 more friend requests.
But after Midnight the following night, I was only able to send around before they stopped working. I added a thread on Last.FM’s community/support forum here.
Maybe someone will shed some light on this. Search engine results for this problem are horrible.
The following is a bunch of predictions. Mark my words. Three areas to pull out your wallet for.
Personal Web Hosting/Cloud/Sync/Backup Services – I’m not sure what to call this space, but I think we’ll be seeing a lot of it. I don’t believe that these kinds of services will be bundled with mobile accounts anytime soon, but that’s clearly where things are headed eventually. The definition is this: Add-On ISP-like services that make mobile and desktop apps work together more effectively. This would include backup services and services that bridge gaps across the various hardware networks we use.
Genealogy – The Baby Boomers love this stuff, and actually so do humans in general. Who doesn’t want to know their own family history? And with DNA analysis becoming more and more standardized, I think that Social-Media-Driven Genealogical Information will probably be mashed together with known hereditary data to create really compelling information services for average people. The word “Rich” comes to mind but that’s really in the hands of designers and visionaries. Imagine what’s going to happen in this space. It blows my mind.
Library Sciences Related Anything – The so-called “Public Library” is probably about to explode into something much more tangled with our daily lives. I believe that tax-funded Public Libraries are getting closer and closer to being able to easily use cutting edge Information Technology to serve the public. The abolition of hard-copy card catalogs went slowly. But we’re in the age of Moore’s Law. It’s no stretch of the imagination that soon there will be title-to-ISBN translators that cross language barriers and so on… But that’s just the beginning. Imagine the Public Library as a place that has cached, categorized databases from all sorts of sources, and Librarians as people helping you to mash data together (while you’re still at home in your underwear or on a train heading to work) …This idea is so hard to see for some people. I could go on for pages about the possibilities. And for you asshole cynics, remember: Facts Cannot Be Copyrighted. “(b) In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.” …Libraries are worth so much to us as people. And when they merge into a global archive of ‘verified’ sources, we’ll really start to see the Web’s potential.
What the Semantic Web (now officially called any number of other things besides that) needs in order to become mainstream, in my opinion, is people and the connections between them. The phrase “The Social Graph” comes to mind, à la Brad Fitzpatrick’s once-famous but now all-but-forgotten manifesto, which even Tim Berners-Lee eventually commented on.
The Semantic Web would catch on if it was seen as even remotely useful by the young people who are most likely going to be building the next big thing on the web.
The beautiful thing about the Web2 era is that highly useful tools can sprout up overnight simply because of the desires of more or less ordinary people with no credentials or affiliation with a company. Everyone knows someone who’s a programmer. The next big social software application just might come from the bedroom of a teenager. There is hardly any barrier to access anymore. This is why Web 2.0 happened. A new tool or service doesn’t need a business plan and a data center to launch and go viral.
The trajectory of innovation throughout the last five years or so, the “Web 2.0” years, has been around capitalizing on people, the content they create, their interests, and the value added by crowd-sourcing. The benefits in the social media space are clear from the perspective of normal end-users as well as giant companies. Mostly, these benefits are about filtering noise and finding relevance on the user side and, on the giant company side, gathering metrics, targeting messages and acquiring free content. The SemWeb standards have a lot to offer the Social Media realm, dare I say, probably even more than CSS with rounded corners does (I hope I’m not offending anyone here).
But the way things are today, for most programmers, implementing SemWeb standards is a lot of extra work with no immediate benefit. Why not just use MySQL or cook up a new XML format?
So why are these standards being completely ignored by the coders on the street? RSS took off. Why not FOAF? I think it’s because there’s no useful directory of URIs for people. There are lots of SemWeb geeks who have URIs, but the kids on MySpace and Facebook don’t have URIs or FOAF files. And those kids’ eyeballs and participation are worth real money!
One fine day, back in 2006, Tim Berners-Lee came down from the mountain and gave us a commandment (or at least he logged into his blog and made a suggestion):
“Do you have a URI for yourself? If you are reading this blog and you have the ability to publish stuff on the web, then you can make a FOAF page, and you can give yourself a URI.”
“One meme of RDF ethos is that the direction one chooses for a given property is arbitrary: it doesn’t matter whether one defines “parent” or “child”; “employee” or “employer”. This philosophy (from the Enquire design of 1980) is that one should not favor one way over another. One day, you may be interested in following the link one way, another day, or someone else, the other way.”
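A rough sketch of what “direction is arbitrary” means in practice: if every statement is stored as a (subject, property, object) triple and indexed from both ends, it stops mattering whether the property was defined as “parent” or “child.” The TripleStore class, the ex: names and the sample people below are invented for illustration. This is a toy in plain Python, not an RDF library.

```python
from collections import defaultdict


class TripleStore:
    """Toy store (hypothetical) that indexes each statement from both ends."""

    def __init__(self):
        self._by_subject = defaultdict(set)  # (subject, property) -> objects
        self._by_object = defaultdict(set)   # (object, property) -> subjects

    def add(self, subject, prop, obj):
        self._by_subject[(subject, prop)].add(obj)
        self._by_object[(obj, prop)].add(subject)

    def forward(self, subject, prop):
        """Follow the link the way it was stated."""
        return sorted(self._by_subject.get((subject, prop), ()))

    def backward(self, obj, prop):
        """Follow the same link the other way around."""
        return sorted(self._by_object.get((obj, prop), ()))


store = TripleStore()
# Only "parentOf" and "employs" are ever stated; no "childOf" or "employeeOf".
store.add("ex:alice", "ex:parentOf", "ex:bob")
store.add("ex:alice", "ex:parentOf", "ex:carol")
store.add("ex:acme", "ex:employs", "ex:bob")

print(store.forward("ex:alice", "ex:parentOf"))  # Alice's children
print(store.backward("ex:bob", "ex:parentOf"))   # Bob's parents
print(store.backward("ex:bob", "ex:employs"))    # Bob's employers
```

Nobody had to decide up front which question would be asked. The person who published “Alice is a parent of Bob” and the person who later wants “who are Bob’s parents?” are both served by the same statement.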
For those of you who don’t yet understand the idea of the Semantic Web, here’s the deal. If there’s one web-address that represents each person, place, thing or idea, it becomes possible to crawl the Web (documents as well as databases) looking for links to that person, place or thing. And if those links contain tags which specify the meaning of the links, the web-at-large begins to look more like a giant database. This is the “Web of Data” (in contrast to the “Web of Documents” we know and love). This is what people call The Semantic Web. So what’s stopping people from being in the “Web of Data” (AKA Semantic Web)? Like Tim Berners-Lee suggested, we need URIs for people. That’s where it all starts. Once there are URIs for people, and there are semantic links (ones that contain tags explaining what they mean) pointing at those URIs, we can start making tools that use that data.
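Here is a small sketch of that idea, under some loud assumptions: the “crawl” is faked with a dictionary of already-fetched statements, and every address is a made-up example.org-style placeholder. The only real-world names are the FOAF namespace and its knows and maker properties.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical URI standing for a person (not for a page about her).
ME = "http://example.org/people/jane#me"

# Pretend crawl results: source address -> typed links found there.
# Each link is (subject, property, object); all addresses are invented.
SITES = {
    "http://blog.example.com/": [
        ("http://blog.example.com/posts/1", FOAF + "maker", ME),
    ],
    "http://example.net/sam/foaf": [
        ("http://example.net/sam#me", FOAF + "knows", ME),
        ("http://example.net/sam#me", FOAF + "knows",
         "http://example.org/people/raj#me"),
    ],
}


def links_to(uri, sites):
    """Return (source, subject, property) for every typed link pointing at uri."""
    found = []
    for source, triples in sites.items():
        for subject, prop, obj in triples:
            if obj == uri:
                found.append((source, subject, prop))
    return found


for source, subject, prop in links_to(ME, SITES):
    print(f"{subject} --{prop}--> {ME}  (found at {source})")
```

Because both sites point at the same URI and say what the link means, two unrelated pages can be queried together as if they were rows in one table. Without the shared URI there is nothing to join on.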
This is a fairly simple concept. And Berners-Lee makes it sound simple enough. Sure, we’ll all just give ourselves URIs and voilà, the Social Graph will go Semantic. That sounds great but there are a few problems with leaving it at that.
Most ordinary people do not have websites or hosting of their own and instead rely on Social Networking Services’ profile pages for their web presence. This means that most people have no way of easily publishing themselves to the Web of Data.
For-Profit Social Networking services have a conflict of interest with regard to providing the Web-at-large with useful, granular “Social Graph” data. Instead we see APIs that give approved developers limited access to data. No love for the average joe like me who is not a programmer.
The Web currently has no trustworthy repository for facts about ordinary people. Trustworthy means not-for-profit at the very least. The closest thing we have is Wikipedia, but Wikipedia does not allow entries on ordinary, non-notable people. (Keep in mind that the facts in Wikipedia’s ‘info boxes’ are published in RDF, one of the core standards of what we have been calling ‘The Semantic Web.’)
We need to start thinking of the Web more like we think of a Public Library, but completely decentralized and with infinite shelf-space. I think Wikimedia, the organization behind Wikipedia, is the best bet for a trusted librarian for all the information about normal people.
I think what is really needed right now is a non-profit-run directory of people, possibly even modeled after Wikipedia, especially when it comes to the concurrent DBpedia project, which publishes facts from Wikipedia to the Semantic Web. Really, I think that because of Wikimedia’s established trust, they would be the ideal organization to do this. Wikipedia could simply have another layer which reveals non-notable results or ‘all results.’
As a major intaker of information about leading technologies, I am proud to say that at the time of the creation of this blog post, I am ahead of the game as far as declaring a change in the language we use to refer to the next phase of web evolution.
The term “web” has never been stronger. The “internet” goes on as something we mention almost every day. And the technologies that comprise the realm of what we have been calling semantic web, mainly markup standards, aren’t going anywhere.
But semantic web just fell out of favor as a [candidate for a] useful euphemism in our language. The moment this became obvious to me was a few weeks ago when I heard that Tim Berners-Lee spoke at TED and didn’t mention ‘the semantic web.’ A few weeks later I saw the video for myself and felt a certain sadness or abandonment when TBL talked about the geekiest dream ever, one that he created, without using the name I thought we had all agreed on for it, The Semantic Web. Instead, he used a different euphemism for the most awesome library system ever conceived. He called it “Linked Data.”
If you are a Semantic Web apologist like myself you might feel slightly deflated by a sudden change in terminology. I’m sorry. I’m sure TBL is sorry too.
But the reality is that “Semantic Web” is always going to be confused with Natural Language Processing, which is also a field of technology that is growing fast in its own right.
No sustaining buzz has really caught on with “the semantic web,” as a catch phrase, beyond us geeks that are already sold on the idea. Instead, we’ve recently heard more and more announcements (made usually by search companies) that include the word semantic as if the mere use of the word means that the company is doing something right.
The battle we’ve been fighting as SemWeb advocates is largely a battle for widespread awareness. TBL has said himself that the phrase semantic web wasn’t the best choice of words.
I’m sure TBL spent at least an afternoon considering what he might say to the audience at TED, which arguably consists of some of the most influential people in the world. I’ve concluded that he intentionally abandoned the phrase, in preparation for a brighter future in which the SemWeb technologies are no longer so easily confused with other technologies. We’ve changed our name.
If you feel the re-branding is unfair, consider who has more right to the word semantic, the Natural Language people or the Interchangeable Data Format people?
We lose.
Sorry. We need to move on.
The Semantic Web is now called Linked Data. It’s official. Take a deep breath, change your notes. And let’s move on as Linked Data enthusiasts, not Semantic Web enthusiasts.
I will lead this effort by removing the category of “The Semantic Web” from this site and replacing it with “Linked Data.” I’ll do it later this week. I need some time to say goodbye.
As a hardcore Linked-Data/Semantic Web Enthusiast for some time now, say since pre-2007 (back then, I didn’t know what to call it but I understood that it was possible), I can’t help but feel sometimes like it’s never going to happen. Sometimes a non-silo Web seems like an idealistic fantasy. Sometimes it seems like nothing is happening. During the first half of 2007, the amount of excitement in the Sem-Web Category of my feed-reader was high. Since then, however, the excitement level seems to have diminished quite a bit. Am I right?
I want to offer a few condolences and some evidence that the Semantic Web is not dead. In fact, I believe it’s still going to “happen.”
Tim Berners-Lee spoke at TED this year, apparently urging people to unlock their data, according to GigaOm (TED, please publish this video soon, OK?). TED has a quickly growing amount of influence in the mainstream from what I can tell. This is good outreach.
JavaScript support for querying more than one URL/Site/Database at a time is coming to a browser near you very soon, according to John Resig via this talk at Google. We’ve seen a lot of new APIs allowing programmers to access certain data from certain places, but more promising to me than these limited and proprietary APIs that have been sprouting up is how HTML itself is increasingly becoming more ‘semantic,’ if for no other reason, because it allows coders to do more interesting and elegant things with CSS and JavaScript… Where this is heading, I think, is toward a future where pages are basically designed to be scraped, a sort of Microformat revolution (albeit totally rag-tag). Once the cat is out of the bag, I really believe embedded HTML semantics will become more and more standardised because of the incremental benefits resulting for the publishers of the content. What I’m talking about here is mainly Classes and ID’s in HTML. Give it some time. Those things are basically Microformats waiting to happen. Right?
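To show why classes and IDs are “Microformats waiting to happen,” here is a rough sketch of a scraper that pulls out the text of every element carrying a given class. It uses Python’s standard html.parser module. The ClassScraper name and the sample page are invented, though fn and url are class names the hCard microformat really uses. It assumes reasonably tidy HTML with no unclosed tags like a bare br inside the elements being collected.

```python
from html.parser import HTMLParser


class ClassScraper(HTMLParser):
    """Collect the text inside every element that carries a given class name.

    Hypothetical helper; assumes tidy markup with matching close tags.
    """

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.results = []
        self._depth = 0    # greater than 0 while inside a matching element
        self._buffer = []  # text pieces gathered for the current match

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Already inside a match: just track nesting of child elements.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted_class in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


# Made-up profile page of the kind a social network might render.
PAGE = """
<div class="profile">
  <h2 class="fn">Jane Doe</h2>
  <span class="friend-count">42</span>
  <a class="url fn" href="http://example.org/jane">Jane <b>D.</b></a>
</div>
"""

scraper = ClassScraper("fn")
scraper.feed(PAGE)
print(scraper.results)
```

The class names were presumably put there for the stylesheet, but once a site uses them consistently on every profile page, anyone with a dozen lines of code can read them as data.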
Last but not least, remember that the emergence of “Linked Data” will probably seem to explode at a certain point, even though the buzz seems to have slowed down in the echo chamber. There’s a great little analogy I came across where data are compared to buttons being threaded together from one to the next, randomly and one connection at a time. How many random single connections need to be made before picking up one button will bring all the others along? The results are reassuring. Check it out over at the Data Evolution Blog, the newest feed in my Feed-Reader.
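The button experiment is easy to try at home. What follows is my own quick simulation of the analogy, not code from the Data Evolution Blog: threads are tied between randomly chosen pairs of buttons, and after each thread the size of the biggest connected cluster is recorded. The function name, the 1,000-button setup and the fixed random seed are all arbitrary choices.

```python
import random


def largest_cluster_sizes(n_buttons, n_threads, seed=0):
    """Tie random threads between buttons; return the largest cluster size
    after each thread. Uses a union-find structure to track clusters."""
    rng = random.Random(seed)       # fixed seed so runs are repeatable
    parent = list(range(n_buttons))
    size = [1] * n_buttons
    largest = 1
    history = []

    def find(x):
        # Walk up to the cluster's representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(n_threads):
        a, b = rng.sample(range(n_buttons), 2)  # two distinct buttons
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra         # merge the smaller cluster into the larger
            size[ra] += size[rb]
            largest = max(largest, size[ra])
        history.append(largest)
    return history


history = largest_cluster_sizes(n_buttons=1000, n_threads=1500)
for threads in (100, 250, 500, 750, 1000, 1500):
    print(f"{threads:>5} threads -> largest cluster holds "
          f"{history[threads - 1]} buttons")
```

In random-graph terms, the jump is expected around the point where the number of threads reaches about half the number of buttons. Before that, picking up a button lifts only a handful of others. Shortly after, it lifts most of the pile. That is the hopeful reading for Linked Data: slow, scattered linking can look like nothing is happening right up until it suddenly isn’t.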
If you haven’t played around with EtherPad, and you have a few friends you can get to screw around with you on this thing, do yourself a favor and try it out.
At first, it’s very simple:
EtherPad is a Collaborative Text-Editing environment. It’s real-time though, so it’s not as much like Google Docs (remember Writely?) as it is like IM. Yes, it’s like Instant Messaging only more instant. Every character typed or removed by anyone working on the text is seen in real-time by everyone else editing the document. The page never has to reload or anything! Ah, the beauty of JavaScript.
Be warned though, this means that the people you’re working with can see how slow you type! And as of yet, there’s no spellcheck, so you’re basically letting it all hang out.
I heard about this from the Technometria Podcast, and it’s clear to me that, as they discussed in the show, for students taking notes during a lecture, nothing I’ve ever seen in my life could ever be as valuable as this technology is, even in its youngest form, that is, as long as the students in question have computers and friends.
Before I go any further, I should mention that my techie friends are all telling me about jQuery… I’m not a programmer, so that doesn’t mean anything to me (yet)… Also, EtherPad is only one of several spotlight applications running on a new platform called AppJet, which I guess is a JavaScript-based development platform that’s really visual/browser-oriented. Maybe even a sort of WordPress for Ajax?
Well whatever. I’m not a dev so I’m not qualified to criticise that stuff, but the mention of jQuery seems timely given what I’ve been hearing, though it may be all hype, as far as I’m qualified to say as a non-programmer. The use of JavaScript in general is not all hype, my instincts tell me… We better move on because I don’t know shit about JavaScript. But I do think it’s the future, if you’re asking my nose.
I would like to see EtherPad with TinyMCE because at the very least, UL’s and OL’s (un-ordered and ordered lists), Bold and Italics, Links Etc, would make the collaboration so much more useful!
Beyond that, I’d love to see an app that can be installed anywhere that allows people to run controlled instances of EtherPad, while controlling certain parameters like the maximum number of characters or lines per document… Etc…
I have a lot of ideas about the possibilities of this kind of real-time text-editing. Big ideas.
I really believe that Google is trying to avoid becoming everyone’s scrape-able Semantic Query Engine. There’s tons of at least semi-semantic data out there and Google simply doesn’t present it to us. They have it. They understand it. They could give it to us. But they don’t. I mean for crying out loud, imagine how difficult it must be for Google to return image search results that are anywhere near as good as Google’s image results are. Does anyone really think that Google is completely ignoring microformats or service-wide presentational semantic data (an example of this would be the HTML classes and IDs assigned to elements on social network pages)? Does anyone really think so? While they’re looking at things like alt attributes and nofollow and everything else? Would Google just ignore piles and piles of metadata? No. Would they decide to not let us use it? I think so.
I think they’re doing a classic ‘roll-out’ thing, saving their best search technology for when they absolutely have to whip it out for competitive reasons. This is cause to resent Google to a certain extent, I think.
NOTE: This is an affiliate link so I do get paid if you sign up with BlueHost after following this link. *But* I only participate in their affiliate program because I actually do sincerely recommend their shared hosting product for most people.
Originally, I saw this on Vimeo, which is a pretty awesome alternative to YouTube. Video quality is usually way better anyways. I was gonna embed the Vimeo version but WordPress is pretty limited with respect to embedded content. So here’s the YouTube version.