Dec

18

The Most Important Tip When Launching Your eBook on a New Site

By Mark Gladding

Successful authors connect with their readers through a blog, a website, a MySpace or Facebook page, Twitter and so on. Over time they build up a loyal following of fans.

When you release a new ebook, it’s very easy to contact this existing fan base. However you will also want to reach new readers and hopefully convert them into loyal fans. One of the best ways of doing this is to announce your ebook on sites like this and publish your ebook on sites like Smashwords.com. This will put your ebook in front of new readers.

The problem is, how do you grab their attention? What makes your ebook stand out from the crowd?

The answer is genuine reviews from loyal readers. As soon as you launch your ebook on a new site, you need to mobilize your fans and let them know your ebook has appeared on the new site. Readers love to share a good book with others and introduce them to their favourite authors. You won’t need to explicitly ask your fans to leave a review on the new site. They will do it spontaneously.

Therefore whenever you launch your ebook on a new site, make sure you blog and tweet about it.

Like everything, this needs to be done in moderation. If you announce your ebook on hundreds of sites all over the internet and then expect your fans to leave a review on every one, they’ll very quickly get tired of it. Therefore choose a few, reputable sites that are likely to connect you with the most readers.

This tip is very similar to the practice of coordinating a large number of readers to purchase your book from Amazon on a specific day in order to promote your book up the Amazon sales charts.

Good luck!

Dec

15

The Top 10 DRM-free eBooks for Christmas ‘09

By Mark Gladding

Here are 10 of the best ebooks announced this year on eBooks Just Published. They have been chosen based on user reviews, star ratings, and uniqueness. Price was not considered when compiling this list - it just so happens that 8 out of 10 are free. Use this list to find some interesting reads for the holiday season.

  • Beasts of New York by Jon Evans. (Fantasy, Free)
  • Eaglethorpe Buxton and the Elven Princess by Wesley Allison. (Comedy, Fantasy, Fiction, Free)
  • Giggling Into the Pillow by Chris Bridges. (Comedy, Erotic, Free)
  • Ghost Of The Black: A 'Verse Full Of Scum by Alan Baxter. (Fantasy, Fiction, Science Fiction, Thriller, Free)
  • Hero Wanted by Dan McGirt. (Fantasy, Young Adult, Free)
  • Hal Spacejock 2: Second Course by Simon Haynes. (Comedy, Sci-Fi, $5)
  • Tokyo Zero by Marc Horne. (General Fiction, Free)
  • Lockpick Pornography by Joey Comeau. (General Fiction, Free)
  • Songs from the Other Side of the Wall by Dan Holloway. (General Fiction, Young Adult, Free)
  • Thin Blood

Dec

11

Should I Buy an e-Reader for Christmas?

By Mark Gladding

Nook, Kindle and Sony e-Readers (from left to right)

We’ll probably look back on 2009 as the tipping point for ebooks. eBook sales have experienced phenomenal growth this year and a swag of new ebook readers has been released, including the Kindle 2 and, most recently, the Nook from B&N. In addition to the dedicated ebook readers, netbooks, the iPod touch and the iPhone have become very capable and popular e-reading devices. There’s also been an explosion in the number of ebook sources, ranging from offerings by independent authors and publishers to those from major, traditional publishers who are finally making their bestsellers available in ebook format.

Despite this rapid progress, things still have a long way to go before ebooks replace print to the same extent that digital music players have replaced CD players.

So is it a good time to purchase an e-reader? Yes, but only if you’re prepared to upgrade in 18 months or be stuck with a seriously outdated device. The competition in e-reading devices is hot and all manner of new devices are appearing on the market. Consumers have more choice and the strong competition is driving innovation.

Unfortunately, offsetting this positive innovation is the sad fact that ebooks are still being sold wrapped in proprietary DRM. This means there is no guarantee the ebooks you buy today for your state of the art Kindle 2 will be readable on the next whizz-bang e-reader from Sony or company XYZ.

Not only will you be stuck with an inferior e-reader, you won’t be able to move your library to a new device.

Thankfully, not all ebooks are encumbered with DRM. Publishing platforms such as Smashwords.com and other independent publishers sell DRM-free ebooks, so wherever possible you should buy those. DRM-free ebooks (especially those in the open ePub format) have the huge advantage of being readable on any e-reader, not just those available today but those to come in the future. It’s a bit like buying chicken at the supermarket - you should always try to buy free-range as it’s more ethical and better for you long term. However, if you’re starving and there’s nothing else on offer, that battery farmed chicken sure looks tempting.

I hope there will come a time when you can be sure that the ebook you buy today will be readable on all e-reading devices, and when e-readers themselves will have evolved to a point of refinement where they have become the norm for reading rather than the exception.

Until that time comes, you have a number of options. If you don’t mind reading from an LCD screen, then using an existing general purpose device such as a netbook or an iPhone/iPod makes good sense. Many actually prefer this reading experience to that provided by the e-ink displays of dedicated e-readers.

Another option is to convert your ebooks to audiobooks and listen to them on your iPod or MP3 player. This option has the advantage that you can listen to audiobooks at times when it’s impractical to read, like when you’re driving, meaning you can pack a bit more reading time (or a lot if you have a long commute) into your day.

Finally, print is still a good option, unless you need to move house like we did last month. Then you’ll be amazed at how few books you can pack into a box before it becomes too heavy to lift, and you’ll realize that advances in ebook technology cannot come quickly enough!

Nov

18

New Audiobook Link Lets Readers Listen to eBooks on the Go

By Mark Gladding

Text2Go converts ebooks to audiobooks
New ebook announcements on this site will now include an audiobook link wherever possible, e.g.

 Convert to Audiobook

When this link is clicked, the ebook will be downloaded to your PC and converted to an audiobook using text to speech. The text to speech conversion is done using the program Text2Go. If Text2Go is not already installed, the reader will be prompted to do this first. As Text2Go is a Windows only application, this will only be available to readers with PCs.

Once the ebook has been converted to an audiobook, the reader can listen to it on an iPod, iPhone or any other portable device capable of playing MP3 files.

The following tutorial on the Text2Go website explains the process -

Converting an eBook to an Audiobook Tutorial

Why would you want to listen to an audiobook?

The advantage of an audiobook is it can be listened to during the day when reading is impractical or impossible. For example, you can easily listen to an audiobook while driving to work, going on holiday, out walking or in the gym.

An audiobook can be listened to while doing some other mindless activity, so it feels like you’re getting something for free. This is especially true when you find yourself stuck in traffic. It’s a great way of negating the frustration that threatens to overwhelm you in such situations.

Finally, the audiobook format is ideal for the visually impaired.

Details

  • There must be an ePub version of the original ebook available.
  • If the ebook being announced isn’t free, the audiobook link will link to a free sample where possible.
  • If the ebook already has a companion audiobook or podcast, this will be linked to instead.
  • Text2Go is a commercial application that can be used freely for 30 days. After that time, the reader must purchase Text2Go ($US25 or $US45 including a high quality voice) if they wish to continue converting ebooks to audiobooks.
  • I will be adding an audiobook link to existing announcements when time permits.

Nov

14

Are in-store cafes eating into author profits?

By Mark Gladding

Recently I’ve become a regular at the Collins St Dymocks bookstore in Melbourne during my lunchbreak. I like to browse the new releases and then grab a coffee or occasionally a bite to eat at the in-store cafe.

The other day I realised that although I’m a regular visitor to the bookstore, I haven’t bought a book for a while. Instead I’ve spent the equivalent of several bestsellers on food and coffee over the last few months.

Do authors realise that this is where their money is being diverted? Should bookstores offer a loyalty card that gives you a free book after every 20 coffees purchased?

Is this another advantage ebooks have over print - there’s no danger of buying a virtual coffee while browsing an online bookstore.

Food for thought.

Sorry, I couldn’t resist just one more cheesy pun.

Nov

8

Text2Go Converts DAISY ebooks to Audiobooks

By Mark Gladding

In a previous post, I asked ‘Where are all the DAISY ebooks hiding?‘ I was looking for some sample ebooks in the DAISY DTBook text only format to test Text2Go’s ebook to audiobook conversion. At the time I could only find one sample ebook in the DAISY DTBook format. I needed a wider selection of ebooks to be confident Text2Go could handle all the various elements of the DTBook format.

Thankfully, Varju Luceno, Communications and Marketing Specialist of the DAISY Consortium provided an extensive list of links to various DTBook sources and Paul Biba over at TeleRead.org also re-posted my plea which netted me a couple more sources.

This has allowed me to dramatically improve the DAISY DTBook support in Text2Go and I’ve just released a new version of the Text2Go 4 beta. I still don’t believe it’s perfect, so I’m offering anyone who finds a bug in Text2Go’s DAISY DTBook support a free Text2Go license. Just use the Support command built into Text2Go to send me an email describing the problem. If I can reproduce the problem I’ll send you a free license.

Note: There are a couple of types of DAISY DTBooks.

  1. Many DAISY ebooks already come with an audio track that’s synced to the text. There is no point in using Text2Go to generate speech for these books as they already contain an audio track, often narrated by a real person, rather than synthesized speech.
  2. There are text-only DAISY ebooks. These require a specialized reading device or a software application like Text2Go to turn the text into speech. It’s this type of DAISY ebook I want to support.

Oct

29

Amazon Patents Digital Watermarking Technique for Excerpts

By Mark Gladding

Slashdot has an interesting post on a patent filed by Amazon that describes a technique for digitally watermarking passages of text. It does this by substituting one or more key words within the text with synonyms for those words. These synonyms are stored in a database and each key word may have more than one synonym. Using this technique it’s possible to deliver a unique version of the text to each requester. Not only that, the specific combination of synonyms used can be stored in a database against the requester’s details. If that particular excerpt is ever misused or illegally distributed it will be possible to track it back to the original requester. The beauty of this technique is the reader is none the wiser that the text has been modified from the original. It’s the textual equivalent of watermarking a digital image by subtly adjusting the colour levels of a small proportion of its pixels.
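To make the idea concrete, here is a minimal Python sketch of synonym-based watermarking. It is not taken from the patent: the synonym groups, the sample sentence and the function names (embed, extract, serve, trace) are all made up for illustration, and a real system would need a much larger, context-aware synonym table.

```python
# Hypothetical synonym groups; each group holds interchangeable variants of one key word.
SYNONYM_GROUPS = [("big", "large"), ("quick", "fast"), ("begin", "start")]

# Map every variant to (its group, its position within that group).
LOOKUP = {word: (group, idx)
          for group in SYNONYM_GROUPS
          for idx, word in enumerate(group)}


def embed(text, code):
    """Replace the n-th key word in text with the variant selected by code[n]."""
    words = text.split()  # naive tokenizer: words with attached punctuation are not matched
    slot = 0
    for i, word in enumerate(words):
        if word in LOOKUP and slot < len(code):
            group, _ = LOOKUP[word]
            words[i] = group[code[slot] % len(group)]
            slot += 1
    return " ".join(words)


def extract(text):
    """Read back which variant appears at each key-word position."""
    return tuple(LOOKUP[w][1] for w in text.split() if w in LOOKUP)


ORIGINAL = "the big dog was quick to begin"
issued = {}  # stands in for the database: variant combination -> requester


def serve(requester, code):
    """Hand out a uniquely marked copy and remember who received it."""
    issued[tuple(code)] = requester
    return embed(ORIGINAL, code)


def trace(leaked_text):
    """Return the requester whose combination matches the leaked text, if any."""
    return issued.get(extract(leaked_text))
```

Each requester receives a different combination of variants, so a leaked copy can be matched back to the account it was served to as long as the key words survive intact.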

The main purpose of the patent seems to be uniquely identifying excerpts of text from a copyrighted work that is served digitally to a number of readers. It also specifically describes how synonym substitution would make it difficult to automatically reconstitute an entire document by successively requesting adjacent excerpts. The patent states that if the overlapping region of two adjacent excerpts was populated with different synonyms then this would confuse the program trying to stitch the two back together. I seriously doubt this would work in practice. The number of synonym substitutions required to confuse a program with even some basic smarts would make the original work unrecognisable. You’ve only got to look at how well photo stitching software can line up two images to know that matching two excerpts of adjacent text is going to be a trivial task, with or without synonym substitution.

What is exciting to me about this technology is it could be used as a very effective and unobtrusive form of social DRM for ebooks. Because the identifying information is hidden within the text itself, the text can be packaged in an open format such as ePub or HTML. The reader is free to store it on any device, print it, use text to speech to speak it aloud and save it to any private storage medium. Because it contains unique information that identifies the reader they are discouraged from sharing it with others.

There are however a couple of downsides to this technique. The first thing I thought when reading about the technique was ‘An author’s choice of words is sacred. I don’t want to read a book that’s been tampered with, no matter how subtly’. After thinking about it some more, I believe I could live with it for most titles. The immediate exception that springs to mind would be something like Shakespeare. Not only is his prose so clever and precise, it’s so well known and widely quoted, that to tamper with it would be an abomination. The longer the work, the better the technique would work as there is more information in which to hide your identifying information. In a large work, only a very tiny fraction of words would need to be changed.

The other downside to this technique is it seems like it would be very easy to circumvent. To identify the words that have been substituted with synonyms, you would just need to download two copies from separate accounts and use a textual diff to see which words have changed. No doubt only a small percentage of the entire set of substituted words would vary between two copies. However you could imagine a group of people getting together to download and compare a large number of copies. Once you know the majority of words that have been substituted, you could substitute your own synonyms, obliterating the identity of the original requester(s) and perhaps inadvertently assuming the identity of some other luckless customer.
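The comparison step described above needs nothing more than a word-level diff. The sketch below uses Python's standard difflib module; the two sample sentences are invented and the function name differing_words is only illustrative.

```python
import difflib


def differing_words(copy_a, copy_b):
    """Return (words in copy_a, words in copy_b) for every span where the copies differ."""
    a, b = copy_a.split(), copy_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes
```

Two copies only reveal the positions where they happen to differ, which is why a larger group of copies would be needed to uncover most of the substituted words.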

Nevertheless it’s an interesting idea that I hadn’t heard of before. I’d be interested to hear what authors have to say about having their work modified.

Oct

29

Where are all the DAISY ebooks hiding?

By Mark Gladding

DAISY is an XML-based e-book format created by the DAISY international consortium of libraries for people with print disabilities. DAISY implementations have focused on two main types: audio ebooks (digital talking books) and text ebooks. DAISY text ebooks are similar in many ways to the ePub format. DAISY uses the DTBook XML document type which provides a rich set of tags for marking up various elements of a book, making it easy to navigate and accurately convert to spoken audio using text to speech.

I’ve been working on ebook to audiobook conversion for the next release of Text2Go, which is now in beta. I’ve provided support for ePub and was hoping to include support for DAISY DTBook. The DAISY specification is freely available and there is a sample ebook in DTBook format. I’ve created a simple DTBook reader which will read the sample DTBook available. However, I need to test this with a large range of DTBooks from multiple sources before I can be confident that I’ve provided a bullet-proof implementation.

This is where I’ve run into problems. I just can’t seem to find a good source of ebooks in DTBook format. Are there free or even paid sources of such ebooks on the Internet or are they only available through libraries or sites catering to the visually impaired? Perhaps I haven’t hit on the right keywords to use in Google? It’s a real shame as I would like to provide first class support for the DAISY DTBook format as it’s been designed specifically for text to speech applications.

If you’ve discovered any good sources of DAISY ebooks, please let me know. Thanks in advance.

Oct

16

ePub, DRM and Text to Speech

By Mark Gladding

It was interesting to read over at TeleRead.org that the Los Angeles Public Library won’t buy e-books in a format for Adobe Digital Editions until ADE software supports text to speech, according to Library Journal.

This is a good decision by the LA Public Library as there are a number of thorny issues surrounding ePub, DRM and Text to Speech.

The first is that DRM protection and text to speech do not sit well together. Why? Because as soon as you offer text to speech you introduce a major security vulnerability into your elaborate DRM mechanism. This is why I believe that many ebooks in PDF format are shipped with text to speech disabled. Granted, in a number of cases, the publisher may not have the audio rights to the work but I suspect the majority of the time they don’t want to subject their works to this vulnerability.

To understand the security vulnerability you have to understand a little of how the text to speech process works. The following is specific to Microsoft Windows. I’m not familiar with text to speech on Linux or MacOS but I assume they have similar mechanisms.

Microsoft Windows supports text to speech using SAPI, the Speech Application Programming Interface. This interface serves two functions. First, it allows any Windows application like Adobe Reader, Microsoft Excel, and my own Text2Go to pass a string of text to the API and have it converted to speech. This speech can be output directly through the PC’s speakers or saved to an audio file (in .wav or .mp3 format for example) for later playback. The actual conversion is done by a computerized voice. Windows XP ships with the atrocious sounding Microsoft Sam voice. Windows Vista and Windows 7 ship with the marginally better Microsoft Anna voice. Thankfully there are a number of 3rd party voice providers who sell a huge range of high quality, natural sounding voices in multiple languages and accents. Second, SAPI allows voices to be registered and made available to any application that wants to provide text to speech functionality. Applications can then use SAPI to discover which voices are available on their system and let the user choose a voice to use.

This brings us to the security vulnerability introduced by text to speech. In order for an ebook to be converted to speech, the entire text must be passed through one of the installed voices. For a normal voice this is not really a problem. The text will be spoken aloud through your speakers. But what if we created our own voice that didn’t convert the text to speech but instead saved it to a file? This would give you the means of instantly creating a plain text copy of an ebook. The only downside would be you’d lose all formatting information.

Such a voice would be very easy to develop. Microsoft even provide a sample voice as part of their documentation. Applications such as Adobe Reader or Adobe Digital Editions would have no way of knowing if your voice was a genuine text to speech voice or a text to text file voice.
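The trust problem can be shown with a toy model in Python. This is deliberately not SAPI and talks to no real speech engine: the Voice interface, both voice classes and the read_aloud host function are hypothetical stand-ins. The only point is that the host sees the same interface and the same kind of output from both voices.

```python
class Voice:
    """Stand-in for a text to speech voice: the host hands it text and gets 'audio' back."""

    def speak(self, text):
        raise NotImplementedError


class ToySpeechVoice(Voice):
    """Pretends to synthesize audio by returning one fake byte per character."""

    def speak(self, text):
        return bytes(len(text))


class RecordingVoice(Voice):
    """Returns identical fake audio, but also keeps every string it was given."""

    def __init__(self):
        self.captured = []

    def speak(self, text):
        self.captured.append(text)
        return bytes(len(text))


def read_aloud(pages, voice):
    """Toy host application: passes each page of text to whichever voice was selected."""
    return b"".join(voice.speak(page) for page in pages)
```

From the host's side the two calls are indistinguishable, which is why some form of certification for the voice itself is the only real defence.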

The only way to guard against this would be for Microsoft to introduce a certification process for all SAPI-compliant voices. Voice vendors would be required to submit their voices to Microsoft for validation. Once verified, the voices would be digitally signed to identify them as being certified and to ensure they were not later tampered with. Applications could then choose to only use these certified voices for text to speech.

If you’re thinking this is a little far fetched then you may like to know that this is precisely the process Microsoft requires Vista-compatible video drivers to go through. This was to prevent the user from installing a video driver onto their system that pipes the video output from a DRM-protected HD-DVD or Blu-Ray disc directly to an unencrypted file. Peter Gutmann of the University of Auckland has conducted an interesting analysis of the Microsoft Vista DRM.

Closed, proprietary systems that don’t allow you to install your own software, such as the Amazon Kindle, will be less vulnerable to this approach.

As ebooks gain in popularity it will be interesting to see if Microsoft introduce a validation system for text to speech voices. In the meantime I’m sure publishers will continue to demand control over whether their works support text to speech on a title by title basis.

Which brings us back to the lack of text to speech in Adobe Digital Editions. Even if Adobe do add text to speech support, it’s still not much use to readers if publishers persist in disabling text to speech for the majority of their titles. The LA Public Library need to insist that not only does ADE support text to speech but all supplied ebooks also have text to speech enabled.

This confusing state of affairs makes it difficult for the ebook purchaser to answer the simple question ‘Can I convert this ebook to speech?’ prior to purchase. One of the great benefits of the ePub format is that you know there are no hidden restrictions on what you can do with your ebook. It can be read on any device with an ePub compatible reader and text to speech will always be possible. However now there are ‘DRM-protected ePub ebooks’ around, once again the consumer is left needing to ask a hundred questions to determine their rights for each individual title.

The real deception here is continuing to call ‘DRM-protected ePub’ ebooks ePub. As soon as an ePub ebook is wrapped in DRM it loses all the advantages of an open standard that come with ePub. I’m sure publishers recognise that ‘DRM-protected ePub’ is quite a mouthful and the tendency will be to shorten it to ePub.

I find this muddying of terminology particularly frustrating as I near the release date for a major upgrade to Text2Go which will support converting ePub ebooks to audiobooks. I would like to be able to say in my marketing material that ‘Text2Go supports ePub ebooks’ without having to add a caveat such as ‘except those protected by DRM’. A statement such as this means nothing to people outside the industry and all of a sudden you’re having a technical discussion on ebook formats, DRM, its restrictions and why is it necessary. By the time you’ve finished, if the customer hasn’t fallen asleep or fled, they’re going to be highly confused or suspicious of ebooks.

To my mind once an ePub ebook is wrapped in DRM it should not be allowed to use the name ePub. Perhaps instead they could be referred to as eSnub - the  format publishers use when snubbing the rights of readers and the format readers should snub if they know what’s good for them.

 —

For those interested in participating in the Text2Go ebook to audiobook beta, drop an email to markgladding at ebooksjustpublished.com and I’ll send you a prerelease copy as soon as it’s available.

Sep

30

Smashwords Outage Last Weekend

By Mark Gladding

Dan Holloway, author of the highly rated Songs from the Other Side of the Wall informed me that Smashwords had experienced an outage for several hours over the weekend. He was concerned the timing was particularly bad as it coincided with the time our weekly email digest of ebook releases gets sent out.

I asked Smashwords’ founder, Mark Coker, what had happened. Here is his response.

We had two separate outages Sunday, each caused by the same problem, though it took the second outage for us to verify the exact cause (a single author’s attempt to publish a bad file caused Meatgrinder to go haywire, causing a crash on each of her two upload attempts). The author was probably not as badly affected as they think. Despite being down for just over 50% of the day’s hours, the site’s traffic dropped only 18% from the previous day, according to Google Analytics. I think the impact was mitigated because half the downtime occurred while the US was sleeping, and by around 2pm Pacific the problem was resolved.

Smashwords has always been a very reliable service. This is the first time I’ve ever heard of any serious outage.

On the positive side, Smashwords have just announced a distribution agreement with Sony, allowing Smashwords authors to publish their works at the Sony eBook store. This comes just 4 weeks after a similar agreement was reached with Barnes and Noble.

Aug

22

Timely eBook Starring Squirrels

By Mark Gladding

With squirrel mania currently sweeping the Internet, what better time to announce a free novel by award-winning author Jon Evans. Beasts of New York is an urban fantasy about the wildlife of New York City, starring a squirrel protagonist who has to find his way from exile back to his home in Central Park, rescue his mother, and win a war.

If you haven’t heard about the squirrel that photobombed a Canadian couple’s holiday pic, you can read about it in the Examiner.  Then use the Squirrelizer to add a cute squirrel to any pic on the Internet.

Aug

7

Finally, an eBook Trailer that doesn’t suck!

By Mark Gladding

An amateurish ebook video trailer does more harm than good. Most of the ones I’ve seen lately send me scrambling for the back button in the first 5 seconds. Here’s one from Seth Godin for his free Tribes Q & A ebook that I enjoyed so much I watched it to the end - a first!

Please let me know if you’ve seen any other good examples lately.

Jun

17

Creating an ePub Reader for Text to Speech Use

By Mark Gladding

Recently I’ve been working on an ePub reader prototype. Once I’ve created a robust ePub ebook reader, I’m going to move this functionality into my text to speech application, Text2Go. My goal is to provide a system that will convert an ebook to speech and transfer it to your iPod in a single click. This will allow any ePub formatted ebook to be turned into an audiobook which can then be listened to while driving, walking, working out at the gym or any other activity where reading is impractical.

The focus of my ePub reader is quite different from the norm due to the fact that the recipient is not a human reader but a machine reader or computerized voice. A computerized voice cares nothing for fancy layouts, font selection or images. This makes my job a lot easier in many ways. However a computerized voice lacks one important skill a human reader uses frequently, often at a subconscious level. A computerized voice has no way of skimming over a section of text. For example, human readers will never read the same footer at the bottom of every page or meticulously read every page number. If this text is mixed in with the actual body of the story (usually as a result of some blind conversion process from a different ebook format to ePub), then the computerized voice will read this text in full on every page. This becomes incredibly irritating for the human listener.
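One rough way to approximate that skimming before the text reaches the voice is to drop lines that repeat on most pages, along with bare page numbers. The Python sketch below is only an illustration under stated assumptions: it assumes the text is already split into pages, and the function name, the 60% repeat threshold and the page-number pattern are all hypothetical choices, not how Text2Go works.

```python
import re
from collections import Counter

# Matches lines that are only a page number, e.g. "12" or "Page 12".
PAGE_NUMBER = re.compile(r"(page\s+)?\d+", re.IGNORECASE)


def strip_page_furniture(pages, min_repeat_ratio=0.6):
    """Remove repeated header/footer lines and bare page numbers from each page."""
    page_lines = [[ln.strip() for ln in page.splitlines() if ln.strip()]
                  for page in pages]
    # Count on how many pages each distinct line appears.
    counts = Counter(line for lines in page_lines for line in set(lines))
    threshold = max(2, int(len(pages) * min_repeat_ratio))
    cleaned = []
    for lines in page_lines:
        kept = [ln for ln in lines
                if counts[ln] < threshold and not PAGE_NUMBER.fullmatch(ln)]
        cleaned.append("\n".join(kept))
    return cleaned
```

A heuristic like this can misfire, for example on a refrain that genuinely repeats throughout a book, so it is no substitute for an ebook whose markup separates body text from page furniture.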

The ePub standard provides direct support for structured documents that include footnotes, sidebars, annotations, page numbers, etc. This is achieved using an alternative XML document format known as DTBook. The ePub standard recommends this format be used for educational publications and publications that are highly structured - for example when it’s important that the page layout of the original printed document is maintained. DTBook actually stands for Digital Talking Book and is a standard developed by the DAISY Consortium for blind, visually-impaired, physically handicapped, learning-disabled or otherwise print-disabled readers. Although originally designed for talking books, the extra information (i.e. the metadata) the DTBook format contains about the ebook text makes it a great choice for any ebook and will greatly assist in applications such as my text to speech application.

Which brings me to the fact that not all ePub titles are created equal. Those that have been lovingly hand-crafted by someone who understands the ePub format will never suffer these problems. Those that have been blindly converted using an automated tool from a source document format that doesn’t make any distinction between the various roles of text within a document will be plagued by such problems. This seems similar to differences in quality between different editions of a print title. Unfortunately you can’t heft an ebook, feel the quality of the paper between your fingers, flex the binding or quickly thumb through the pages of an ebook prior to purchase. The ability to view a sample goes a long way to solving this but it would also be worthwhile for reviewers to comment on how well the ePub book has been put together, what underlying format is used to represent the text and does it have a table of contents to make navigation easy.

My ePub reader has three tasks to complete in decreasing order of importance.

  1. Extract the text from the document, converting it from html to plain text ready for text to speech conversion.
  2. Organize the content into chapters. Each chapter will become its own audio track. This will make it easier to navigate than a single huge audio track.
  3. Extract the cover image and use it as the album art for each audio track.
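Before any of these tasks can start, the reader has to find the book's content files. An ePub is a zip archive whose META-INF/container.xml points at a package (OPF) file, and that file's spine lists the content documents in reading order. The Python sketch below shows one way to walk that structure with the standard library. The function name spine_documents is made up, and the sketch assumes a well-formed book with plain, unencoded hrefs.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub_file):
    """Return the paths of an ePub's content documents in reading order."""
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml names the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")

        # The manifest maps item ids to file names; the spine gives the order.
        package = ET.fromstring(zf.read(opf_path))
        manifest = {item.get("id"): item.get("href")
                    for item in package.findall("opf:manifest/opf:item", OPF_NS)}
        base = posixpath.dirname(opf_path)
        return [posixpath.join(base, manifest[ref.get("idref")])
                for ref in package.findall("opf:spine/opf:itemref", OPF_NS)]
```

Each returned path can then be read from the same archive and handed to the text extraction step. A book that stores everything in a single html file will simply return a one-item list.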

Although all ePub documents conform to a standard, there is a lot of variability possible within the standard. In order to make my reader as robust as possible, I’ve been trying to find ePub titles from a wide range of sources. I’m particularly interested in those that have been created with different tools or even hand-crafted.

ePub Reader Prototype

An edition of White Fang by Jack London illustrates this variability nicely. It had a couple of unique features I needed to handle. Firstly it didn’t have a standard ePub table of contents. All it contained  was a single html file. However this html file had its own table of contents embedded at the top of the file.  It was implemented as an html table, with each entry containing a link to the relevant section further down in the document. Those familiar with html markup will know that you can use the anchor tag to name a specific point in a document. You can then link to this named anchor so your browser (or reader) will position you at this exact point in the document when the link is clicked. You can use this technique when linking to an external document or within the same document.

I had noticed the ePub authoring tool, Calibre, uses named anchor points when creating links for its table of contents. However because each chapter was stored in a separate file (within the ePub file, which is actually just a zip archive), the anchor points were a bit superfluous - each chapter just had a single anchor point at the top of the page. Although this was how Calibre organised its chapters, I imagined that it was quite possible to store an entire book in a single html file and use named anchor points to link to the relevant sections from the table of contents.

Therefore I added the ability for my ePub reader to split a book into sections based on where the named anchor points lay in the document. This had an unexpected benefit when I came to read the White Fang title. Although it didn’t contain a standard ePub table of contents, my reader was able to find the named anchor points it had used to implement its own in-page table of contents and correctly split it into chapters.
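Splitting on named anchors can be sketched with Python's built-in html.parser. This is an illustration rather than the actual Text2Go code: the class and function names are invented, every anchor tag with a name (or id) attribute is treated as a section boundary, and text pieces are simply joined with spaces.

```python
from html.parser import HTMLParser


class AnchorSplitter(HTMLParser):
    """Collects text into sections, starting a new section at each named anchor."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.sections = [("", [])]  # (anchor name, text pieces); "" = text before any anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            name = attributes.get("name") or attributes.get("id")
            if name:
                self.sections.append((name, []))

    def handle_data(self, data):
        self.sections[-1][1].append(data)


def split_on_anchors(html):
    """Return a list of (anchor name, plain text) pairs, skipping empty sections."""
    parser = AnchorSplitter()
    parser.feed(html)
    parser.close()
    result = []
    for name, pieces in parser.sections:
        text = " ".join(" ".join(pieces).split())  # collapse runs of whitespace
        if text:
            result.append((name, text))
    return result
```

The text that appears before the first anchor, typically the in-page table of contents, comes back as a section with an empty name, so it can be skipped or kept as its own track.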

The other interesting feature of the White Fang title also centred around the table of contents. As I said before, this was implemented using an html table. However when my reader extracted the text from the table, all the text was run together. This would have been disastrous when it came time to convert it to speech. Unlike human readers, computerized voices don’t recognise an unusually long word as being a number of words run together.

When extracting text from a table, I needed to understand that a table cell acts as a natural boundary for text and should be punctuated accordingly.
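
The sketch below shows one way this could be done in Python. It is an illustration, not the code from my reader. The names TableAwareTextExtractor and extract_text are hypothetical, and ending each cell with a full stop is just one assumption about what makes a sensible pause for a speech engine.

```python
from html.parser import HTMLParser


class TableAwareTextExtractor(HTMLParser):
    """Extract text, optionally treating the end of a table cell as a break."""

    CELL_TAGS = {"td", "th"}
    PUNCTUATION = ".!?:;,"

    def __init__(self, mark_cells=True):
        super().__init__()
        self.mark_cells = mark_cells
        self.parts = []
        self.last_char = ""  # last non-whitespace character seen so far

    def handle_data(self, data):
        self.parts.append(data)
        stripped = data.rstrip()
        if stripped:
            self.last_char = stripped[-1]

    def handle_endtag(self, tag):
        if not self.mark_cells or tag not in self.CELL_TAGS:
            return
        # Assumption: a full stop gives the speech engine a natural pause.
        # Skip it if the text already ends with punctuation.
        if self.last_char and self.last_char not in self.PUNCTUATION:
            self.parts.append(".")
            self.last_char = "."
        self.parts.append(" ")


def extract_text(html, mark_cells=True):
    """Return the text of an html fragment with whitespace collapsed."""
    parser = TableAwareTextExtractor(mark_cells=mark_cells)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Hypothetical table of contents laid out as an html table.
SAMPLE_TABLE = (
    '<table><tr>'
    '<td><a href="#ch1">Chapter I</a></td>'
    '<td><a href="#ch2">Chapter II</a></td>'
    '</tr></table><p>Body text.</p>'
)

if __name__ == "__main__":
    print(extract_text(SAMPLE_TABLE, mark_cells=False))  # text run together
    print(extract_text(SAMPLE_TABLE))                    # cells punctuated
```

With mark_cells switched off, the two entries and the paragraph come out as one long run of characters, which is exactly the problem described above.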

What is clear is that it’s really beneficial to gather as many ePub format ebooks from as many varied sources as possible to test my reader with. You can imagine my delight then when the new ePub ebook site, ePubBooks.com, was announced recently on Teleread.org. Here is another source of ePub books, generated using their own tool. From the sample of titles I’ve downloaded, these seem to be very well formatted. My young and naive ePub reader had no trouble loading and splitting them into chapters. For interest’s sake, I downloaded their version of White Fang. This version had an ePub table of contents. The end result was the same as the first version I had found. My only complaint with the titles at ePubBooks.org is they have no cover image. This obviously doesn’t detract from the reading experience but it does make organising and browsing through your library of ebooks a lot less fun if they don’t have cover images.

Finally, if anyone knows of other ePub sources or has ePub format ebooks that have been hand-crafted or created with other tools, then I’d love the chance to run them through my ePub reader. Please drop me a line at markgladding at ebooks just published dot com.

May

26

Why ePub is the ebook format of the future

By Mark Gladding

Not an official ePub logo :(

For a long time I’ve wanted a painless way of converting ebooks to speech in my text to speech application, Text2Go. Up until recently I’ve thought the best way to achieve this would be to support the PDF standard. Although PDF is ubiquitous and a great way to distribute documents that will ultimately be printed, due to its internal structure, it’s actually quite difficult to reliably extract text from a PDF document. There are a number of software libraries that can be purchased to perform this task. However these are either unreliable, overly complex, exorbitantly priced or a combination of all three. This situation has been brought about by a combination of factors, such as PDF being a proprietary format, being binary rather than text based, and being designed primarily to accurately represent a printed document.

These problems have meant that I’ve deferred adding automatic ebook conversion to Text2Go.

The ePub format changes all this as it’s the complete opposite to PDF in a number of important ways.

Firstly, ePub is an open format, not controlled or owned by any one company. This means that anyone can download the ePub specification (actually a series of standards, including OPF, maintained by the IDPF). There are no licensing costs or restrictions associated with the standard.

Secondly, ePub is built on top of existing standards. This is perhaps the most important difference between ePub and PDF and the one most likely to guarantee the ultimate success of ePub.

An ePub file is actually a zip archive, containing multiple files. You can examine the contents of any ePub file simply by renaming it to a .zip file and opening it with any tool or OS that supports the zip archive format (e.g. Windows XP and above, WinZip, 7-Zip, etc).
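
You can also look inside programmatically. The following is a small Python sketch using the standard zipfile module. The function names are my own, and the archive it builds in memory is only a stand-in with made-up file names, not a valid ePub. To inspect a real ebook you would pass its path to list_epub_contents instead.

```python
import io
import zipfile


def list_epub_contents(epub_file):
    """Return the names of the files stored inside an ePub (a zip archive).

    epub_file may be a path or a file-like object. No renaming is needed,
    because zipfile looks at the content rather than the file extension.
    """
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


def build_sample_archive():
    """Build a tiny stand-in archive in memory, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("metadata.opf", "<package/>")
        archive.writestr("toc.ncx", "<ncx/>")
        archive.writestr("chapter1.html", "<html><body><p>Hello</p></body></html>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_epub_contents(build_sample_archive()):
        print(name)
```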

Inside a typical ePub file, you will find the following types of file.

  • metadata.opf - An xml file containing information about the ebook, such as the author, publisher, title and a list of all the other files in this ePub file.
  • toc.ncx - A table of contents for the ebook.
  • One or more html pages, containing the ebook text.
  • Any images used in the ebook, such as a cover image, and images that accompany the text. Images are stored in standard formats such as jpeg.
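
As a rough illustration of how little is needed to read the metadata file, here is a Python sketch using the standard xml.etree.ElementTree module. The function read_opf and the sample document are hypothetical, and the sample is far smaller than the metadata file of a real ebook. The two namespace URIs are the ones used by OPF and Dublin Core.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OPF package documents and Dublin Core metadata.
NAMESPACES = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A made-up, cut-down metadata file for demonstration only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Title</dc:title>
    <dc:creator>Sample Author</dc:creator>
    <dc:publisher>Sample Publisher</dc:publisher>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="chapter1.html" media-type="application/xhtml+xml"/>
    <item id="cover" href="cover.jpg" media-type="image/jpeg"/>
  </manifest>
</package>
"""


def read_opf(opf_xml):
    """Return the title, author, publisher and file list from an OPF document."""
    root = ET.fromstring(opf_xml)

    def first_text(path):
        # Return the text of the first matching element, or None if absent.
        element = root.find(path, NAMESPACES)
        return element.text if element is not None else None

    return {
        "title": first_text("opf:metadata/dc:title"),
        "author": first_text("opf:metadata/dc:creator"),
        "publisher": first_text("opf:metadata/dc:publisher"),
        "files": [
            item.get("href")
            for item in root.findall("opf:manifest/opf:item", NAMESPACES)
        ],
    }


if __name__ == "__main__":
    info = read_opf(SAMPLE_OPF)
    print(info["title"], "by", info["author"])
    for href in info["files"]:
        print(" ", href)
```

The function only asks for the elements it cares about. Anything else in the file is never looked at, so extra or unexpected metadata does no harm.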

Notice that the standard file formats used to build the web, such as xml, html and jpg, are used rather than any new, ePub-specific formats. The benefits of this approach are immeasurable. Support for all these formats is built into every modern operating system and programming language. The technology required to create an ePub reader application is the same as that required to display a web page, and just about any modern computing device, be it a PC or, more importantly, a mobile device, ships with this technology.

One of the things I love about ePub is that all text is represented in text files, be it html or xml. There is something incredibly reassuring about being able to open a file in a basic text editor and view or edit it.

Finally, the ePub format is DRM-free. This means that anyone purchasing an ePub file can rest assured that they have full access to the content, and are free to convert it to any other format, transfer and display it on any device, print it and, importantly in this case, convert it to speech. Compare this to the sad state of affairs that plagues the Amazon Kindle now that publishers are disabling the TTS feature on more and more of their titles.

The ePub format doesn’t preclude the use of DRM but fortunately to date no one has come up with a DRM scheme for ePub. Personally I hope this never happens and all titles released as ePub remain completely open. At the moment buying an ebook in ePub format is a guaranteed way of knowing the title is completely DRM-free.

Unfortunately, the above two paragraphs are just my wishful thinking. There are already at least 2 DRM’ed ePub formats in existence, as pointed out by Keith Fahlgren in the comments below :(

So far I’ve found developing a reader for the ePub format a relatively painless process. This is primarily due to the fact ePub uses existing standards. I’ve got a complete toolbox of software libraries I can use to read images, html, xml and zip archives. This lets me concentrate on building the text to speech conversion process. The nice thing about xml and html is that my reader can easily ignore information that’s not needed by my application. This makes for a more robust reader. Superfluous or unexpected information is simply skipped over.

I’m very excited about adding ePub support to Text2Go and can’t wait to release it. I have my eye on the ever growing collection of ebooks published on Smashwords.com, all available in ePub format.

I believe the openness of ePub coupled with the ease of implementing an ePub reader makes for a very bright future. I can only see more and more companies, organisations and individuals embracing ePub, in the same way people embraced html on the web.

Please, please, publishers, choose DRM-free and choose ePub!

Sidenote: The ePub logo used above is not an official logo. Instead it’s one of a series of free public domain logos provided by Threepress Consulting. The sad fact is that ePub still doesn’t have an official logo. David Rothman of Teleread.org lamented this fact almost two years ago.

May

11

Change in announcement frequency

By Mark Gladding

Since starting this site my goal has been to announce one new ebook every day. I’ve managed to achieve this since the site began in early November, over 210 announcements ago. However of late I feel like things have become a little too mechanical. I’ve decided to relax my policy of announcing an ebook every day. Instead I’m going to be a little more discerning in what appears on the site and aim for a more reasonable two announcements per week.

Preference will still be given to authors or publishers who create their own announcements.

Fewer announcements will have a few benefits I can see.

  1. More time to search out interesting ebooks.
  2. Announced ebooks will appear at the top of the site for longer.
  3. Less likely to induce RSS fatigue in those that subscribe to the eBooks Just Published feed.

I’m going to trial this new approach for a few weeks. I’d love to hear what you think.

Mar

26

The Amazon Kindle Text to Speech Fiasco

By Mark Gladding

Amazon’s extraordinary cave-in to the Authors Guild over the Kindle 2’s text to speech feature says more about the importance Amazon places on ebooks and its own Kindle Reader than anything else.

For those who haven’t been following the story, Amazon recently introduced an “experimental” text to speech feature with its new Kindle 2. This allowed any book to be read aloud using a built-in computer voice. Computerized speech is nowhere near as good as a real human narrator but it’s quite understandable and great for listening to ebooks while driving or doing manual work, and of course it’s a lifesaver for the visually impaired. Why they labeled it “experimental” is anyone’s guess but here are a few reasons that come to mind.

  • Development wasn’t sure they could complete it by the Kindle shipping date and it only made it in at the last minute.
  • Product management wasn’t sure a computerized voice would be well received by readers.
  • Not enough real world testing had been done prior to shipping to know with confidence that it would work well in practice.
  • Perhaps they had an inkling there could be some legal fuss.

Well, it turns out the text to speech did work. Certain readers found it especially useful, some going so far as to buy a Kindle just for this feature. Even the press were kind. I doubt there was a single article on the Kindle 2 that didn’t mention the new text to speech functionality. Amazon seemed to be gathering some renewed interest in its Kindle Reader.

And then it all turned to shit. The Authors Guild got wind of it and decided the Kindle’s ability to read a book aloud was infringing on the audio rights of publishers and authors. They demanded Amazon remove the text to speech feature, effectively gagging the Kindle. Now as a layperson this claim seems absurd - how is this any different from a parent reading a print book to a child or a sighted person reading aloud to the visually impaired?

Amazon’s initial response was the same. “Kindle 2’s experimental text to speech feature is legal.” - CNET News.

Great! Reason prevails. Case closed. Let’s move on. This is the 21st century.

Not so fast. A few days later came Amazon’s back-down. Publishers will now be able to decide whether their ebooks have text to speech enabled or not. This means a publisher can prevent readers from listening to their ebooks on the Kindle. Time will tell how many publishers turn on this restriction, but given they were the ones lobbying Amazon for this control in the first place, it’s a safe bet that most will disable text to speech for their titles.

This decision by Amazon to introduce yet more DRM can only harm their Kindle platform and once again leave readers looking around for alternatives.

The Kindle brand may have been irreparably damaged. Up until this point almost all talk surrounding the Kindle 2 had been positive. Now the talk is about nothing but Amazon’s shameful cave-in to publishers or worse, siding with publishers.

To me this says the powers that be within Amazon don’t place much importance on ebooks or their own Kindle. Why else would they back down so easily? The claim by the Authors Guild that the Kindle’s text to speech infringes publishers’ audio rights seems tenuous at best. Why not put up a fight? Let the Authors Guild take them to court and fight out a test case. Of all companies, Amazon has the resources to fund such a fight. Not only that, they’d get plenty of public support and flow-on publicity for such a stand.

The problem, I believe, is they’re still primarily wedded to the old world of print. They don’t want to jeopardize their relationships with the major print publishers. No one can accuse Amazon of not being innovative. The problem is they’ve innovated around a publishing industry that is fast becoming obsolete. Their online store is just as relevant for ebooks as for print. But order fulfillment and shipping are no longer needed. What value do existing publishers bring to the table for ebooks? Why does Amazon need to deal with them at all?

Unless Amazon can fully embrace the future of ebooks, more forward thinking companies will put them out of the book business. You only have to look at companies such as Smashwords to see the future. It won’t be too long before print becomes the exception rather than the rule. Almost all books will be purchased in electronic form. Most readers will choose to read them using an ereader. Some will prefer a printed copy and take their ebook to a local print on demand service.

This is similar to the switch to digital photography. Today everything is shot digitally, most photos are viewed on the screen and a cherished few are printed.

Amazon leaves me with mixed feelings these days. On the one hand I love the way I can order almost any book over the Internet and have it arrive at my doorstep in Australia a couple of weeks later for less than the cost of purchasing it locally. With ebooks the feelings are opposite. Instead of providing great value and service, they’re trying to lock customers into proprietary DRM-laden formats and charge them a premium for the privilege - ebooks should be a fraction of the cost of a printed book.

Come on Amazon. Wake up and see the future!