The ISBN as SKU

Posted: June 16th, 2008, by Peter Brantley

[mirrored at Peter Brantley's shimenawa blog]

I’ve spent the last few days in New York, and had the pleasure of meeting with various interesting folks. About which more anon, separately.

Many of the conversations revolved around digital books and the future of publishing: what form will books take? Will they be downloadable objects, or eventually migrate to a fully networked book? The consensus was that ultimately the book would live in the cloud, with the implicit assumption that, as network access becomes ubiquitous, more and more of the content a reader will “license” or acquire will not be something that he has any direct physical ownership of, either in bits or paper. Maybe those options will cost extra; maybe they won’t be available. We will read our books on iPhones and Androids, via iBooks and Google Book Search; on Kindle v2 and the Amazon Book Shop.

This may have a profound impact on interpretations of the Fair Use privilege; generally, licenses obviate the ability to assert Fair Use because non-public-domain network assets are usually governed in their use by contract. If, for example, Google Books settles with publishers in the AAP and AG suits, the ability to reclaim Fair Use will become sadly pivotal.

One of the other interesting casualties of this transition will be the existing book identifier schemes. Already, publishers are making a single EPUB digital book package, and then leaving the proliferation of more discrete ebook reader formats to intermediaries, distributors and wholesalers. Ingram will make the XYZ, Amazon will make the Kindle format, etc. The publisher is only responsible for one file, the .epub package.

This was a design goal of the IDPF, of which I am a board member. It relieves some of the work for publishers. What was entirely expected was that this leaves the publisher making one electronic product; what was not thought about as much was that this leaves the publisher with one ISBN for the digital book.

We are rapidly jerking forwards into a near term future where ISBNs will be assigned for derivative digital book products by intermediaries, not publishers. As an astute colleague observed in New York, the ISBN becomes a product SKU.

There are many disadvantages in this; one is that it will become increasingly difficult to find the “book” in the tangled weave of various digital instantiations. Perhaps we will no longer be able to ask how many copies EduPunk 2020 sold.
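
To make that question concrete, here is a minimal sketch of the roll-up a publisher would need once intermediaries assign their own ISBNs. Everything in it is hypothetical: the product identifiers, the `edupunk-2020` work key, and the sales figures are invented placeholders, and the ISBN standard itself defines no such work-level key.

```python
from collections import defaultdict

# Hypothetical data: each product identifier (an ISBN acting as a SKU) is
# mapped to a publisher-defined work key. All values are invented placeholders.
PRODUCT_TO_WORK = {
    "isbn-publisher-epub": "edupunk-2020",
    "isbn-reseller-kindle": "edupunk-2020",
    "isbn-reseller-xyz": "edupunk-2020",
}


def copies_sold_by_work(sales, product_to_work):
    """Sum unit sales per work; collect products nobody has mapped to a work."""
    totals = defaultdict(int)
    unmapped = []
    for product_id, units in sales:
        work = product_to_work.get(product_id)
        if work is None:
            # A reseller-assigned identifier the publisher never recorded.
            unmapped.append(product_id)
        else:
            totals[work] += units
    return dict(totals), unmapped


sales_report = [
    ("isbn-publisher-epub", 120),
    ("isbn-reseller-kindle", 300),
    ("isbn-reseller-xyz", 45),
    ("isbn-unknown-reseller", 80),
]
totals, unmapped = copies_sold_by_work(sales_report, PRODUCT_TO_WORK)
print(totals)    # {'edupunk-2020': 465}
print(unmapped)  # ['isbn-unknown-reseller']
```

The count for the “book” is only as good as the mapping table: any identifier minted downstream and never reported back lands in the unmapped list, and its sales drop out of the total.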

And even this problem may be transitory. For as books move to the cloud, from digital bundles to network assets, we will not be counting “things sold” but link hits; not things shipped, but pages accessed. As some forward thinking publishers like O’Reilly have already demonstrated, the bookshelf will be not only virtual, but increasingly transitory in composition.

Whether we will be able to successfully rethink our conception of identifiers is a problem that lies beyond us just far enough that we are uncertain even what its contours may be.

11 Responses to The ISBN as SKU

  1. Michael Cairns

    A number of years ago Michael Healy and I were involved in the revision of the ISBN standard. No other point garnered more emotion than the requirement that each version of a title, p or e, have its own ISBN. Both Michael and I (and all of the people on the administration side of ISBN) felt this was necessary to maintain the integrity of the system and accord the same treatment to varying versions of an ebook as we would to a large print or paper version of a print book. Some publishers didn’t see this view; rather they saw the ‘administrative’ burden of applying many multiple ISBNs to a title as more than they could cope with.

    For me (and I think Michael H) the issue was more obvious. If different versions of an e-book title were not given ISBNs by a publisher, then entities in the supply chain would assign them on the publisher’s behalf. At Bowker we saw this happen in multiple instances. Why a publisher would allow a retailer or wholesaler to mess with their bib data made/makes no sense to me. The publisher should manage and administer this information, and allowing others to apply their own ISBNs (or SKUs) is not in the publisher’s interest. What happens is that each supply chain entity applies their own number and this just increases confusion and inefficiency. If publishers maintain the veracity of the standard then I don’t think it will be difficult to identify specific formats of books and the channel in which they were sold.

  2. Laura Dawson

    Maintaining the metadata associated with each one of these ISBNs is incredible. Say you’ve got a Spanish textbook. Then you’ve got the e-version of it in three different formats (VitalSource, Quia, and CourseSmart). Then you’ve got the print lab manual and the print workbook. Then you’ve got the e-versions of these. Plus you’ve got downloadable language labs that you can port into your iPod. You’ve got instructional animations that are available as a separate product that you can put on your iPod as well. That’s 14 related ISBNs with associated metadata attached to each one. Then if you need to customize any of these for a particular state university adoption…

    It’s rather like having a very large extended family and having to keep track of birthdays, anniversaries, graduations, who’s in trouble at school, who needs new clothes, what the babysitting schedule is, who is getting a divorce, and that your niece is learning-disabled and your brother has to sue the school district every year just to get her the services she needs….

    In other words the more ISBNs you have, the more crucial accurate metadata becomes, because otherwise how the hell are you going to keep your products straight? And I think this is in large part what publishers object to. They have a lot of frustration in keeping up with the metadata explosion that goes on with these splintered, fragmented, related products.

    The mistake is in thinking that fewer ISBNs means less data administration. Your products still need to be sold, even if we could come up with a different way of identifying them - just as your husband’s second cousin will still need a new baby gift even if you can’t for the life of you remember her name.

    No, really. The more products you get involved with, the more metadata you have to keep track of. And that’s just a cost of doing business.

  3. Mark Bide

    The challenge of maintaining metadata for media content is becoming increasingly obvious wherever you look.

    In my experience, looking at both publishing and the music industry, the problem is seriously exacerbated by looking at “digital” as an adjunct to “physical”. For example, when considering how to manage the relationships between content and products, and between products and related products, there is a natural tendency to start from legacy systems, particularly those systems we have used for years or even decades to manage metadata for physical content and products, and then try to adapt these to manage digital content and products. Unfortunately, so far as I have seen, such attempts are bound to fail (or at least bound to end in the sort of manual untangling that Laura describes, which over time becomes increasingly unscalable). On the other hand, if you start from processes and systems designed for the management of metadata for digital, the addition of the management of metadata for physical is entirely straightforward. However, this kind of radical approach to systems development requires the sort of rethinking that is not easy to achieve (since it involves every part of the organisation).

    But, as Laura says, managing metadata and identity properly has already become a cost of doing business.

    Mark

  4. Michael Holdsworth

    Some of these issues are addressed in the paper I wrote [pdf] for BISG and BIC earlier this year.

    In one particular respect, that discussion paper has been overtaken by events - along the lines alluded to by Peter B.

    In January, ISBNs were simply not available (within the International ISBN Agency’s rules) for intermediaries and re-sellers to allocate. They couldn’t get a prefix. This forced some to take quite extreme measures: for example, creating “ISBN-like” 13-digit identifiers, or even appropriating pirated prefixes from other (often East European) ISBN jurisdictions.

    After the London Book Fair, however, the Agency agreed that national ISBN agencies (respectively Bowker and Nielsen in the USA and UK) may now legitimately assign registrant prefixes to eBook resellers to enable them to allocate ISBNs to individual eBook formats if, and only if, the publisher has not provided an eBook ISBN for each format, or if (illegally within the scheme) the publisher has used a composite identifier to cover more than one format. If the publisher has provided separate ISBNs for each separate format, then the Agency insists that these should always be used in preference to any reseller ISBNs.

    Publishers need to decide the nature of their eBook supply-chain relationship. Are they selling a product to a retailer to sell on, just like a print book, which is their (= the publisher’s) product all the way through to the user/consumer, and identifiable as such, with their brand, ISBN and their prefix?

    Or are they allowing the retailer to create any number of products from a source file (ePub in its ‘distribution’ model), which the publisher is simply licensing (the audible.com model), leaving it to the retailer/eBookseller to own and brand the product with their own ISBN and prefix?

    The latter relationship starts to look very much like a traditional ‘secondary right’, with the retailer taking on the role as secondary publisher (like a newspaper, an anthologist, or a foreign-language publisher). This has the potential to impact the current debate between publishers, agents and authors’ societies about appropriate eBook royalty structures. And impacting it in the direction of higher royalties.

    I’m not a publisher any more, but if I were, I’d be really nervous of third-parties deploying their own registrant prefixes and ISBNs for my content, but I’d be leaving my business partners with no alternative if I haven’t done the work of pre-allocating differentiated identifiers myself.

    To Laura’s point about systems-bloat, there are real concerns here. But I guess we fall back on the premise that computers are designed to manage complexity. They just need some clear rules…

    Michael Holdsworth

  5. Jim Lichtenberg

    It’s easier, and probably a reasonable first reaction, to think of digital identifiers in terms of the (complex) technology that is involved. But Michael’s point underscores the importance of publishers moving toward a BUSINESS approach to these questions, based on the assumption that your content is YOUR content.

    Granted, paperbacks are now so familiar, but publishers have dealt nicely with their evolving iterations after some grousing at the beginning…. so why not digital versions? In the ancient days of 1995-1996, one of the things that made the Internet scary was the generally held view that publishers were at serious risk of having some giant technology company swoop down and take away all the content in digital form…

    Best way to avoid falling into that deep river :-) is to manage the identifiers as Laura suggests as a cost of business, and respond to digital content as a business challenge, not a technological one as per M. Holdsworth.

  6. Mike Shatzkin

    As several commenters have made clear, how to manage the ISBNs IS a business question and publishers should not be making decisions based on how much admin is involved with one thing or another, but on the real business issues buried in this epub and ISBN discussion.

    What epub allows is a proliferation of formats because the conversion from epub to somewhat different ebook formats can be pretty easily automated by the format creator. One could imagine with the iPhone, for example, where Apple seems (at the moment) uninterested in ebooks but is encouraging independent development, that more than one format could well be developed, perhaps already has been. Somewhat paradoxically and counterintuitively, then, the “standardization” to epub could lead to a proliferation of formats which might not themselves be interoperable.

    And, aside from not being interoperable, some of these versions might work a lot better than others. So among the business questions for the publishers are:

    1. Are you happy with your epub file being turned into any format a downstream vendor can successfully peddle, without feeling the need for any sort of vetting or quality control?

    2. Will you actually resist tracking sales and revenue reporting by the various formats and, in your own records and reporting to the author, just leave them rolled up by master (epub) ISBN?

    3. How far does your laissez-faire attitude go, if a downstream epub vendor were to “enhance” your publication with additional material, criticism, notes, etc. assuming that your price-per-copy-sold were paid to you in full?

    4. Time was, if the book you got was missing a signature or had a page printed upside down, you knew that the publisher was the responsible party and would address your complaint to the publisher. What does the consumer do if the ebook doesn’t work and the publisher’s involvement ended with the file handoff? Should a publisher accept that situation?

    The question of whether an epub-only ISBN unconsciously crosses a line from selling an instance or an object to licensing content for use may become critical if Amazon decides that licensing the content and delivering their own versions — like a big book club — would be their preferred way to do business. And how far away could that day be?

    Mike

  7. Nick Bogaty

    Leaving aside the subject of ISBN assignment as it relates to EPUB, I’d like to address some issues which Mike brings up below. I think the situation is a bit less dramatic than described:

    As I’ve previously detailed on this list, when we developed EPUB, we imagined its use in one of two ways. The first was software which natively rendered (read) EPUB files like Adobe Digital Editions does. The second was software which automatically converted from EPUB to a proprietary file format. The criteria of success we always casually referred to in the second scenario was that the software would convert EPUB in an “un-stupid” way. What this really meant was that the layout and composition of the original EPUB file would be maintained in the automatic conversion process, leaving a publisher’s book still looking good after conversion. Just like if you converted from MP3 to AAC in iTunes, the file would still sound good after conversion.

    So, a publisher does and will continue to vet their EPUB files, only now on a software basis and not on a file-by-file basis. For example, if a software company comes to a publisher and says, “I can take your EPUB files and automatically convert them to XYZ format and your books will still look good,” the publisher just needs to be sure that this statement is true. A publisher can make sure this is true by taking 10 or 20 or 100 files, opening them in the software, and seeing what their books look like. At the end of the test, if everything converts well, no additional vetting would be required because the software company passed the test of converting “un-stupidly”. This isn’t a publisher giving up vetting, QC and responsibility for a well produced product; it is just a more efficient production workflow following an understanding by a publisher that a company’s software works as expected.

    I’ll add, though, that an advantage of software that renders EPUB natively is that this vetting and QC step to ensure that a file is converted correctly does not have to happen. The software renders the EPUB file as the publisher created it.

    Finally, I do not believe that EPUB necessarily allows the proliferation of file formats; that certainly wasn’t our intention when we developed the format. Only in very unusual circumstances that I know of does software or hardware require conversion to a proprietary format because EPUB does not, for some reason, provide the necessary technology. Proprietary formats simply exist now because EPUB wasn’t around for the past 8 years and software companies needed inventory from publishers to do business, so they made their own. With all of the recent announcements by publishers on releasing EPUB inventory and a critical mass of EPUB selection coming to the market (I’d guess by this Fall), the marks in the “pro” column for software supporting a proprietary format seem to me to dwindle.

  8. Grace Agnew

    I think the issue of relationship among different entities becomes more and more critical. The identifier can be “dumb” and used to validate authenticity of an object and ensure appropriate access, e.g., if I purchase the license to the Kindle version, I need an identifier that ensures I download Kindle and not a different e-book format, regardless of the distributor or download source. The dumb identifier works if relationships among the various formats and versions exist, perhaps using OAI-ORE, to enable me to select the most appropriate version for me (e.g., the Kindle) so that the identifier associated with the Kindle can then ensure that I retrieve and use the appropriate resource–the resource I most likely have purchased the rights to.

  9. Kassia Krozser

    For many years, I worked at a motion picture company that managed a large number of smaller products associated with a single “meta” product (think fifty years of a daily soap opera to understand the scope). I often wonder why the ISBN process doesn’t mirror this more closely: the “product” (title/author/publisher[s]), the format, and distribution channel. This keeps a single number associated with the actual title/author, but uses identifiers to isolate specific changeable aspects related to the book.

    I’m not saying that the system used by my former employer was perfect — in fact, the entire notion of SKU destroyed the lovely symmetry as unaffiliated third parties had their own numbering systems — but it sure beats separate and unique numbers for units that are part of a whole.

  10. John Mark Ockerbloom

    “When we developed EPUB, we imagined its use in one of two ways. The first was software which natively rendered (read) EPUB files like Adobe Digital Editions does. The second was software which automatically converted from EPUB to a proprietary file format.”

    How is this different from OEBPS, which was first released back in 2000, and which was also imagined to be used both of these ways? The reason I ask is because that format, in my recollection, didn’t actually get used much as a native-reader format in many commercial ebook releases, but seemed to mainly be talked about as a “publisher intermediate format” after it was released.

    “Proprietary formats simply exist now because EPUB wasn’t around for the past 8 years and software companies needed inventory from publishers to do business so they made their own.”

    This appears to skate over the whole DRM issue. Are you expecting that this time most publishers will be content to release straight Epub (which I didn’t think had DRM, but correct me if I’m wrong), instead of putting it in their favorite DRM wrapper? If so, what makes you think that? (I’d like that, but I’m not sure I see definite signs of it happening.) If not, why would Epub do anything about the existing proliferation of proprietary formats for commercial ebooks?

  11. Imma Wildcard

    There’s one real easy and quick fix to the multiple copy problem, and one that would require only a slight modification to computer programming that both creates and checks a barcode. AND one that would save publishers money and headaches.

    The checksum number offers the possibility of 9 variations if it is discarded as a checking device (a rather quaint use, in some ways) and used instead as an ID number. Thus, one ISBN can now ID nine variations of the same book.

    Checking the ISBN for error during entry? Well, if the title on the book doesn’t match the catalog, we know something’s wrong, right? And if it is being read by a machine, just how common are errors today — I would bet very low. I’d prefer to double-check my numbers as I enter them and have an occasional glitch in inventory and have an instant nine possible variations on one ISBN.

    And for books sold only through Amazon.com or the like, the need to be scanned without error is almost non-existent. For a small publisher, these “bootleg ISBNs” make perfect sense (for the reason I’m about to show below).

    There are other advantages: I can instantly call up ALL the variations of a title if I know just one ISBN, whether for the print, PDF, Kindle, talking book, or whatever version I might have in hand. If I just enter the ISBN for one version and a wild card in the checksum (now ID) space, I can see all the versions in one nice column.

    Try finding the PDF, MP3, Kindle, Hardcover, softcover, etc., etc., with one ISBN number search today and you’re going to have problems. Chances are you won’t discover them all, especially if this or that company has created a version of their own and tacked a new SKU to it (which as someone else noted, causes loss of control for the publisher, yet is done because of the expense of buying yet one more ISBN for the same title).

    Of course this change in how the checksum is used would make it so those institutions issuing ISBNs can’t gouge little publishers for 9 numbers when one would do the trick, and do it better for most of us. So those working for big publishing companies should be sure to avoid even considering such a change lest small presses have a level playing field. Just saying ;o)
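
For readers weighing the proposal in comment 11, here is a minimal sketch of what the ISBN-13 check digit does today, and what would be given up if the final position became a free variant code. The function names are illustrative only, and the sample number 978-0-306-40615-7 is a commonly cited worked example, not a title discussed in this thread.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the ISBN-13 check digit for the first 12 digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """True if the 13th digit matches the check digit computed from the first 12."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


body = "978030640615"
print(isbn13_check_digit(body))              # 7
print(is_valid_isbn13("978-0-306-40615-7"))  # True
print(is_valid_isbn13("978-0-306-40625-7"))  # False: one mistyped digit is caught

# Of the ten possible final digits (0-9), exactly one passes validation today.
variants = [body + str(d) for d in range(10)]
print(sum(is_valid_isbn13(v) for v in variants))  # 1
```

If the last position were repurposed as a variant code, all ten of those strings would have to be accepted, so a mistyped digit elsewhere in the number could no longer be detected from the number alone.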
