hVia

Update: After thinking about this a bit more, ‘citeVia’ may be a better potential name.

For my Internet Systems Research class last night, we had Tantek Çelik come speak on microformats. I believe it was more or less the same presentation he gave at SxSW this year.

Everyone in the class seemed very interested and excited by these new developments and Eran and I were so excited that in our regular after-class-beer-drinking, we worked out an idea for a new microformat.

The problem we want to deal with is ‘via links’ (also called hat tips). These are the links at the end of a blog entry that give credit to someone for providing the information to the blogger.

It seems obvious that this is a type of citation, so the <cite> tag would seem the appropriate choice. However, this is a special kind of citation which is well recognized among bloggers and distinct from content citations.

Before describing a solution, let me pause to explain why we think this is a problem. First of all, there have been studies that track the spread of ideas through the blogosphere. Unfortunately, Adar, et al., had to do a lot of work inferring connections where there were no supporting data (i.e., when people didn’t put via links, the researchers had to infer where they got the information). So, having parseable via-links would be a great benefit to those who study the blogosphere.

Secondly, via-links can be useful to bloggers and blog readers. We all care (or should) about people’s sources and via-links are a way for bloggers to be transparent and give credit where credit is due. The practice of using via-links will enable blog-readers to track ideas back to their source.

Thirdly, bloggers who attribute their sources will benefit by incentivizing their readers to send them pointers to interesting material.

So, given those reasons, we conclude that using via-links is a best practice for bloggers and that there is reason enough to have a machine-parseable semantic format for these citations.

Here’s our idea– we’ll use the <cite> tag with a special class. For example, original markup from BoingBoing:

<em>(Thanks, Dave Gill!)</em>

and in our format:

Thanks, <cite class="via">Dave Gill</cite>!

Of course, BoingBoing could use a CSS rule to style this element with italics, to have the same presentation style they have now. Also, note that in this case, there’s no URL for the person being cited. This leads us to the second part of our proposal: when citing someone as the source, you should use the most specific URL possible.

So, if the pointer comes from someone else’s blog entry, you should reference that entry. If the info came in some other manner, but the person still has a personal website, you should link to that site. Of course, if the person doesn’t have a personal website, their name is the best we can do and is considered sufficient.
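To make the "most specific URL" rule concrete, here is a minimal Python sketch of how a blogging tool might generate the markup. The function name `via_markup` and its parameters `entry_url` and `site_url` are hypothetical and exist only for illustration; they are not part of the proposal.

```python
from html import escape
from typing import Optional


def via_markup(name: str,
               entry_url: Optional[str] = None,
               site_url: Optional[str] = None) -> str:
    """Build a via citation, preferring the most specific URL available.

    Hypothetical helper: a specific blog entry beats a personal site,
    and a bare name is used when no URL is known.
    """
    url = entry_url or site_url
    label = escape(name)
    if url is None:
        # No website at all: the name alone is considered sufficient.
        return f'<cite class="via">{label}</cite>'
    # Only the name and link go inside the <cite> element.
    return f'<cite class="via"><a href="{escape(url, quote=True)}">{label}</a></cite>'
```

Called with only a name, this produces the BoingBoing-style citation shown above; called with a URL, it wraps the name in a link inside the <cite> element.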

Now for some more robust examples of what we’re proposing:

First, from [photomatt](http://photomatt.net), an example of a non-specific web citation. Here’s his markup:

<cite>Hat tip: <a href="http://deadheart.net/">neiljmorrow</a> via email.</cite>

As you can see, he’s already using <cite>, so he’s ahead of most, but we think things can be improved. For the sake of clarity it seems more useful to put only the person’s name (and link) within the <cite> tag. So, with our proposal, Matt’s markup would become:

Hat tip: <cite class="via"><a href="http://deadheart.net/">neiljmorrow</a></cite> via email.

Matt may not think this situation is optimal, and he would have to do some more work to get things styled the same way he has them now, but we think keeping the <cite> element to just the name/link of the person is worth the tradeoff.
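Since the whole point is machine-parseability, here is a rough sketch of how a crawler might pull via citations out of a page using only Python’s standard `html.parser` module. The class `ViaLinkParser` and the function `extract_via_links` are names made up for this example, and the sketch assumes reasonably well-formed markup with at most one link per via citation.

```python
from html.parser import HTMLParser


class ViaLinkParser(HTMLParser):
    """Collect the name and URL of every <cite class="via"> element."""

    def __init__(self):
        super().__init__()
        self.vias = []        # results: dicts with "name" and "url"
        self._current = None  # the via citation being read, if any
        self._depth = 0       # nesting depth of <cite> inside it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._current is None:
            classes = (attrs.get("class") or "").split()
            if tag == "cite" and "via" in classes:
                self._current = {"name": "", "url": None}
                self._depth = 1
        elif tag == "cite":
            self._depth += 1
        elif tag == "a" and self._current["url"] is None:
            # The first link inside the citation is taken as the source.
            self._current["url"] = attrs.get("href")

    def handle_endtag(self, tag):
        if self._current is not None and tag == "cite":
            self._depth -= 1
            if self._depth == 0:
                self._current["name"] = self._current["name"].strip()
                self.vias.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current["name"] += data


def extract_via_links(html):
    """Return a list of {"name", "url"} dicts for the via citations in html."""
    parser = ViaLinkParser()
    parser.feed(html)
    parser.close()
    return parser.vias
```

Note that a plain <cite> without the class, like Matt’s original markup, is deliberately ignored here; that is exactly the ambiguity the class is meant to remove.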

This example also brings up an issue we discussed last night: source types. As Matt does here, bloggers often indicate how the information reached them, whether via a blog, email, IM or face-to-face. At this point we’ve decided not to try to encode that into the microformat, but the idea is open for future work.

Ok, so what I’ve presented here is a rough outline of what we think would be the best practice regarding a via-link microformat.

Please give any feedback you have, public or private, especially if you have a better idea for a name (the working name is ‘hVia,’ but is open to change). My email is ‘ryan’ at this domain.

11 Responses to “hVia”

  1. Bud Gibson Says:

    So, I agree on keeping the via link simple. The win in this seems to be showing the chain you went through to get the link. Whether it came by email or something else seems like some extra bookkeeping to me. People may not be so good at putting that in or even remembering it.

    The other thing I would note is that it would be great if you got tool support for this. I find it can take me a long time to put all of these links in, even the direct ones. I envisage this: a bookmarklet that allows you to select text on a page and creates the link with via based on referrer (if you so choose). I think that would be the workflow. You could probably achieve it with Greasemonkey on Firefox.

    One speculative question: would rel=via work here? I suppose the problem with that is that the rel indicates the relation of the link to the current page, not another link.

    A picky point, in the first part of the photomatt example, I’m not seeing the cite tags that you say Matt is already using. Are they properly escaped?

  2. ryan Says:

    Bud- thanks for the feedback.

    Re: tools, that will definitely be important for something like this to work. I’m already considering a WP plugin to help out with this.

    About ‘rel’: the thing is, a via-link seems a lot like a citation to me and therefore the cite tag seems appropriate. I may be wrong here, and that’s why I’m asking for feedback.

    About matt’s tags… my bad, escaping problems.

  3. Joseph Scott’s Blog » Blog Archive » Solving The Via Problem Says:

    […] links. What we really need is a way in markup to show the relationship. Ryan King has a few ideas on how to deal with this. He’s referring to this problem as “hVia […]

  4. Tantek Says:

    I really like the analysis you went through Ryan. I think where you ended up is close to an ideal solution. Wrapping a hyperlink with <cite> clearly communicates that the link itself is the citation, not just the contents of the link. If you just wanted the contents to be the citation then you could put the <cite> inside the <a href> instead.

    You added class="via" to the <cite>, which makes sense as a way to distinguish such citations. But is there such a need? I mean, what else could a <cite> around an <a href> mean other than a “via” link? And if there is nothing else it could mean, then there is no reason to distinguish such citations with the class attribute (nor the rel attribute for that matter).

    Thus I think you may have found a very good case for a new “XHTML compound” (the combination of two or more already well defined XHTML elements to express a new precise semantic) rather than a new microformat. This is quite an accomplishment for a few reasons. First, XHTML compounds don’t require extending XHTML at all, and thus you are capturing semantics in a way that is already fully supported by the specification, and can be expected to be fully supported by more applications today. Second, by not inventing any new concepts/names/terms/schema, an XHTML compound saves you the time of having to discuss such new things, and avoids the risk of potentially getting them wrong and having to iterate. And third, due to the more greatly constrained set of semantic combinatorics, it is much harder to come up with a new and useful XHTML compound than it is to come up with a new microformat.

  5. ryan Says:

    Tantek-

    Thanks for the reply. I agree that an XHTML compound would be ideal, but I don’t think it’s possible in this case. Here’s why, taking an example you used in class:

    <p><cite>Eric Meyer</cite> wrote:</p>
    <blockquote….

    I would be tempted to write it like this:

    <p><cite><a href="http://www.meyerweb.com" rel="met">Eric Meyer</a></cite> wrote:</p>
    <blockquote…

    To me this seems like a different case. This is a <cite> surrounding an anchor tag, yet it’s not a via link. In this case it is just an identification of the person we’re citing.

    Of course, I may be wrong and am willing to hear arguments otherwise.

  6. Bud Gibson Says:

    Ryan:

    I agree with you on this last point as I find myself frequently doing the same thing with cite. I also tend to think the more disambiguating information the better, as the real value-add of all of this comes when various services start to aggregate the individual contributions.

    Bud

  7. Brian Del Vecchio Says:

    Ryan, I have wanted a formal way of linking back to sources, both in blog posts and in Delicious posts. Getting attribution right is a real sore point for me, and I think you’re off to a good start here.

    I use this kind of backlink in two different ways. When there’s a specific post that’s relevant, I’ll link to the post. Otherwise, if a specific link isn’t available or appropriate, I’ll link back to the root of the site.

    A via marker without a URL link back doesn’t seem all that useful to me, since there’s no way to disambiguate one Ryan King from another.

  8. Tantek Says:

    Ryan, the example you gave is why I distinguished between a <cite> around an <a href>, and an <a href> around a <cite> (re-read my comment). I would have marked up that example as the latter if citing/quoting something Eric said in person, e.g.

    <a href="http://www.meyerweb.com"><cite>Eric Meyer</cite></a>

    But if citing/quoting something Eric said on meyerweb.com, then I would use the former:

    <cite><a href="http://www.meyerweb.com">Eric Meyer</a></cite>

    which also expresses the exact meaning you are seeking to express with viaCite as far as I can tell, that you are quoting something you got from a particular website/blog.

  9. limbo Says:

    To clearly illustrate why we believe citeVia is necessary I want to talk a little bit about quoting in general. The following are my thoughts about citing and quoting, partially building on Tantek’s comments.

    There are several reasons and ways for citing a reference.

    A simple quote would usually be marked up as
    <cite>source</cite>
    <blockquote>quote</blockquote>

    If we want to give a link to the source of the quote, we use the cite attribute of the blockquote tag:
    <cite>author</cite>
    <blockquote cite="source url">quote</blockquote>

    To just reference a source or person without actually quoting we use the cite tag alone. The more common case for this would be referencing a source (academic paper, journal article, book, etc.) in which case we use the following forms of mark up:

    <cite>author</cite>
    <a href="source or web page"><cite>author</cite></a> to add a link.
    <cite><a href="source">author</a></cite>

    In the second case, the nesting means we’re quoting a part of a work by said author and the link should lead to further information about the author or this work. The third case means we’re citing the entire resource found at the link’s HREF attribute and this work was authored by the cited author. Note: this distinction comes directly from Tantek’s comments on the subject.

    With citeVia we wish to mark explicitly a specific sub-class of the above third case. Since via links carry specific importance and since there could be quotes of similar class (and hence similar markup) in the same document, we think that creating a specific class for via links is correct.

    Following is an example that, I hope, fully illustrates the need for specific via markup.

    Blogger A (http://a.com/foo.html)

    Foo.

    Blogger B (http://b.com/on_foo.html)

    <a href="http://a.com/"><cite>a</cite></a><blockquote cite="http://a.com/foo.html">…foo…</blockquote>

    Blogger C

    <cite><a href="http://a.com/foo.html">A</a></cite> says that foo

    is interesting material
    .
    .
    .
    via: <cite class="via"><a href="http://b.com/on_foo.html">B</a></cite>

  10. the ryan king » Blog Archive » citeVia Says:

    […] … textually active since for a little while now.

    « hVia

    citeVia

    Since I first wrote about […]

  11. Hellonline » Blog Archive » citeRel comments Says:

    […] XHTML” and in discussions following that talk. You can read more about that on Ryan’s early posts about the microformat. CITE A stands for citing the linked source well enough […]