Google Books: How does it work?

But in reality, Google is so far ahead that none of its rivals is likely to catch up. The consensus among observers is that it cost Google several hundred million dollars to build Google Books, and nobody else is going to spend that kind of money to perform the feat a second time. Those rivals do have a focused commitment to books, unencumbered by distractions like running one of the largest advertising businesses in the world or managing a smartphone ecosystem.

In popular mythology, interminable lawsuits turn into hungry maelstroms that drown the participants. Think of Jarndyce and Jarndyce from Bleak House, the generations-spanning estate fight whose legal fees eat up all the assets at stake. In the tech business, court battles like the celebrated antitrust suit that plagued IBM for years tend to pinion giant corporations and give new competitors an opening to lap an incumbent.

Google itself rose to dominate search while Microsoft was busy defending itself from the Justice Department. It taught Google something valuable. In a sense, the company behaved like the Uber of intellectual property — a kind of read-sharing service — while expecting to be seen the way it saw itself, as a beneficent pantheon of wizards serving the entire human species.

It was naive, and the stubborn opposition it aroused came as a shock. Sometimes you have to play politics, too — consult stakeholders, line up allies, compromise with rivals.

It grew up. That takes a chance encounter between the protagonist and a particular book that provides an illuminating insight. Breaking a challenge into approachable pieces, turning it into data, and applying efficient routines is a powerful way to work.

The hard labor is still ahead. To date, the full experience of reading a book requires human beings at both ends.

Answer: Whenever you can see more than a few snippets of an in-copyright book in Google Books, it's because the author or publisher has joined our Partner Program and granted us permission to show you the Sample Pages View, which helps you learn enough about a book to know whether you want to buy it. This is something we do with a publisher's explicit permission.

Question: If a book is still under copyright, is scanning it actually legal?

Answer: This is probably the most common misconception about Google Books, and about copyright law in general. The "fair use" provisions of U.

Fair use is designed to safeguard copying that doesn't harm people's incentive or ability to produce and sell creative work, including books.

We've carefully designed Google Books to make sure our use of books is fair and fully consistent with the law. There are some works that librarians have to take special care of to prevent their falling apart. In short, Google Books could mean better access to more information for more people than ever before. It could revolutionize the Internet in ways that we can't yet imagine.

But as with all revolutions, the Google Books project is not without controversy. Citizens, politicians and companies from around the world have justifiable concerns about privacy, copyright law and antitrust issues related to Google Books.

Keep reading to see how Google quickly scans millions upon millions of pages of books, and how some people are doing everything they can to handicap this daring project.

Scanning millions of books is a gargantuan undertaking, and the technical challenges alone are significant. Traditional scanning equipment uses a glass plate that completely flattens each page, so that OCR (optical character recognition) software can identify the letters and numbers printed on the pages being digitized. Once the characters are recognized, the text can be edited and searched with a computer.
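To make "searched with a computer" concrete, here is a minimal sketch of an inverted index built over OCR output. It assumes the OCR step has already produced plain text per page; the `ocr_pages` data and the function names are hypothetical illustrations, not a description of Google's actual search system.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the sorted page numbers that contain it."""
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical OCR output: page number -> recognized text.
ocr_pages = {
    1: "It was the best of times, it was the worst of times",
    2: "it was the age of wisdom, it was the age of foolishness",
}
index = build_index(ocr_pages)
print(search(index, "age of wisdom"))
```

The index trades memory for speed: each query touches only the entries for its own words instead of rereading every scanned page.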

To eliminate the need for glass plates and reduce the possibility of damage to the books it wants to preserve, Google patented a new book scanning process. Workers simply place the book on an open book scanner that has neither a glass plate nor any other equipment that would flatten a book.

Google's advanced software scans the book and accounts for curvature of the pages, meaning there's no degradation of character recognition.
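One simplified way to picture curvature correction: if the software knows the height of the page surface across the book's width, the distance measured along that curved surface tells it where each point would sit on a flat page. The sketch below assumes a hypothetical, already measured height profile as input. The article does not describe how Google's patented system measures page shape or remaps pixels, so this is an illustration of the geometry only.

```python
import math


def flattened_positions(heights, dx=1.0):
    """Return the arc length along the page surface up to each sample.

    heights: page surface height at evenly spaced horizontal positions.
    dx: horizontal spacing between samples (same unit as heights).
    The arc length is where each sample would lie if the page were flat.
    """
    positions = [0.0]
    for previous, current in zip(heights, heights[1:]):
        segment = math.hypot(dx, current - previous)
        positions.append(positions[-1] + segment)
    return positions


# A flat page keeps its spacing; a sloped page is stretched back out.
print(flattened_positions([0.0, 0.0, 0.0]))
print(flattened_positions([0.0, 3.0], dx=4.0))
```

Text near the spine looks compressed in a photograph because the page slopes away from the camera there; mapping by arc length restores the original spacing before OCR runs.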

The scanners work at a rate of about 1, pages per hour. Google developed agreements with major libraries to start the project. With the help of these institutions, Google has already scanned around 12 million books [source: von Lohmann]. The project's expansiveness means that its greatest promise is granting access to books that people would otherwise never see.

A student in Florida can access a special Native American collection on the other side of the country. People who can't afford to travel to see ancient texts in France might browse those tomes from their living rooms. And thanks to Google's extra efforts, a visually impaired person can view books on enlarged displays, use Braille equipment, or listen to documents through read-aloud technology.

Initially, Google Books planned to digitize only works in the public domain, which made up about 20 percent of all books [source: Toobin]. In the United States, books generally enter the public domain 70 years after the author's death; once in the public domain, they're no longer protected by copyright. However, as Google scanned, it began digitizing copyrighted texts as well. The company didn't put copyrighted materials online in their entirety, instead limiting what it displayed to about 20 percent of each book.
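The two rules of thumb in this paragraph can be sketched as simple arithmetic. This assumes the article's simplified 70-years-after-death rule (real copyright terms depend on publication date and other factors) and treats "about 20 percent" as a hard cap. The function names and constants are hypothetical and do not reflect how Google actually decides what to display.

```python
PUBLIC_DOMAIN_YEARS_AFTER_DEATH = 70  # simplified rule from the article
PREVIEW_PERCENT = 20                  # "about 20 percent" from the article


def is_public_domain(author_death_year, current_year):
    """Simplified check: the term runs through the end of the 70th
    calendar year after the author's death."""
    return current_year > author_death_year + PUBLIC_DOMAIN_YEARS_AFTER_DEATH


def max_preview_pages(total_pages):
    """Whole pages that fit under the preview cap (integer arithmetic)."""
    return total_pages * PREVIEW_PERCENT // 100


print(is_public_domain(1900, 2024))
print(max_preview_pages(300))
```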

Google claimed this was a fair use of copyrighted materials. Others strongly disagreed. The Authors Guild and the Association of American Publishers filed a class-action lawsuit, fueling controversy about Google Books in the United States and around the world. Copyright, access and profit issues are at the center of the Google Books debate. Rights holders want more control over distribution of their work, and they also want part of the profits that Google generates from its digital archive.

Google, on the other hand, wants more control over the information it is digitizing -- with better control, Google Books would not only become the world's biggest library, it could be the world's biggest bookstore, too. This is deeply Google thinking but without the dominant algorithm. It's a Google subspecies that evolved by feeding on a different corpus. There is less data about books than web pages, but there is more structure to it, and there's less spam to contend with. Yet the focus on optimizing an experience from vast amounts of data remains.

The most difficult part of making Google Books work, said James Crawford, the team's engineering director, was determining the intent of the service's heterogeneous user base. Scholars who search Google Books have very different wants and expectations from casual users looking to find a trade fiction title. Sometimes they are looking for information about that book.

Third, they want to buy a copy of that book," Crawford said. Rich Results will help people who are looking specifically for a title, but Crawford said that they aren't ruling out other presentations or features for other user types e.

All the Google Books tweaks I've noticed are small.


