Here's How to Search The Deep Web

Category: Search-Engines

There are over 130 trillion individual Web pages as of this writing. But that's only the very tip of the iceberg. Beyond what popular search engines offer up, there is a universe of information that's online and discoverable -- if you have the right skills and tools. Here's how to gain access to the rest of the Web...

What's Out There on the Deep Web?

One hundred and thirty trillion is a big number. Back in 2016, Google's "How Search Works" page said the number of web pages had doubled since 2013, and in 2008 there were "only" one trillion Web pages. Clearly, none of us will ever lack for reading material. And that's only the beginning of what's available online.

General search engines like Google, Bing, and Yahoo! index only the "surface Web," pages that have unique URLs such as the one that appears in your browser's address box right now. They don't even index (catalog) all of the surface Web. Some website owners prefer not to have their sites (or portions of them) indexed, so they use a file called robots.txt that tells search engines, "Don't include this page in your database."
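For the curious, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before indexing a page. It uses Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" name, and the example.com addresses are made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; real sites publish theirs at /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/index.html",
        "https://example.com/private/report.html",
    ):
        # "ExampleBot" is an invented crawler name.
        print(url, is_crawlable(ROBOTS_TXT, "ExampleBot", url))
```

Keep in mind that robots.txt is a request, not a lock. It only works because reputable search engines choose to honor it.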

Search engines choose to exclude many other surface Web pages from their indexes for a variety of reasons including relevance, legality, and violations of search optimization policies. Other pages are locked behind passwords, intended only for those who are granted access. If you're curious what the very first web page looked like, it's still there. That page was published on August 6, 1991 by British physicist Tim Berners-Lee, who invented the World-Wide Web. (See my Brief Internet History Lesson.)

Searching the Deep Web

Beneath the surface Web lies the "Deep Web," a mass of information hundreds of times greater than the trillions of surface pages discovered by Google. By their very nature, Deep Web resources cannot be found by the web-crawling software that search engines use to find and index pages.

Note that the "Deep Web" is not the same as the "Dark Web" where criminals lurk. On the Dark Web, everyone tries to hide their identities as well as what they're doing. Unless you're the type who wanders down dark alleys at 2 AM, you'll want to steer clear of the Dark Web. The Deep Web, by contrast, consists of perfectly legitimate information and the people who use it.

Some of these Deep Web pages can be accessed only by a user clicking or manually typing a link that's not been indexed by a search engine. Other Deep Web pages can be accessed only by a user who directly enters a query in a search form. The desired data exists in a database, not on a Web page that a crawler can find by following links from other pages. The data retrieved in response to a user query is displayed as a "dynamic" Web page that lasts only until the user moves on. When you search for products on an ecommerce site like Amazon, the results are dynamic pages.
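To make the difference concrete, here is a rough sketch of what a simple link-following crawler "sees" on a page. It uses Python's standard html.parser module, and the sample page is invented for illustration. Real search engine crawlers are far more sophisticated, so treat this as a simplified picture rather than a description of how Google actually works.

```python
from html.parser import HTMLParser


class LinkAndFormFinder(HTMLParser):
    """Collect the links a crawler can follow and the forms it would skip."""

    def __init__(self):
        super().__init__()
        self.links = []  # link targets a crawler can follow
        self.forms = []  # form targets that need a typed query

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "form":
            self.forms.append(attributes.get("action", ""))


# Hypothetical page: one ordinary link plus a catalog search form.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About us</a>
  <form action="/catalog/search" method="get">
    <input type="text" name="q">
    <input type="submit" value="Search">
  </form>
</body></html>
"""

finder = LinkAndFormFinder()
finder.feed(SAMPLE_PAGE)
print("Links a crawler can follow:", finder.links)
print("Forms that need a typed query:", finder.forms)
```

The crawler can move on to the "About us" page, but whatever sits behind the search form stays out of reach until a person types in a query.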

Some Deep Web Search Tools

When Google or Bing don't serve up the results you need, try a specialized search tool. My article Free Online Research Tools will point you to dozens of them, categorized by subject matter. You may also want to try an alternative search engine. See my article The Other Search Engines for a list of those.

The Library of Congress Online Catalog is a good example of a Deep Web resource. It's a database containing millions of records of books, periodicals, audio recordings, photographs, and more. None of its records can be retrieved directly through Google. You need to visit the LOC's Online Catalog page and enter search terms in the appropriate boxes.

Web archives such as The Wayback Machine store copies of Web sites that have been modified or deleted. Such archived pages are not indexed by search engines, which strive to index the current version. I've often found the Wayback Machine handy when I want to see what a website looked like in the past. (Want to see what Yahoo.com looked like in October of 1996? It's in there.)
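If you'd rather jump straight to an archived copy, here is a small sketch that builds a Wayback Machine address for a given page and date. It assumes the commonly seen address pattern of web.archive.org/web/ followed by a 14-digit timestamp and the original URL. To my understanding, the Wayback Machine then shows the capture closest to that moment, if it has one. The function name and the specific October date are my own choices for illustration.

```python
from datetime import datetime

# Assumed address pattern: <prefix>/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for page_url near the given moment."""
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# An arbitrary day in October 1996, chosen only as an example.
print(wayback_url("http://www.yahoo.com/", datetime(1996, 10, 17)))
```

Paste the resulting address into your browser to see what the archive has for that date.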

To find Deep Web material via Google et al., try adding the term "database" to your search query. Searches such as "plane crash database," "drug interaction database," and "government grants database" will often lead to the home page of a database where you can enter search terms specific to that resource.
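If you run this kind of search often, the trick is easy to automate. The sketch below adds the word "database" to a topic and builds a Google search address from it. The helper names are hypothetical, and the only thing it assumes about Google is the familiar search address with a "q" parameter.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.google.com/search"


def database_query(topic: str) -> str:
    """Append the word 'database' to a topic unless it is already there."""
    topic = topic.strip()
    if "database" in topic.lower().split():
        return topic
    return f"{topic} database"


def search_link(topic: str) -> str:
    """Build a search address for the topic plus the word 'database'."""
    return f"{SEARCH_URL}?{urlencode({'q': database_query(topic)})}"


for topic in ("plane crash", "drug interaction", "government grants"):
    print(search_link(topic))
```

Each printed address can be opened in a browser to run the search.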

There are also paid tools such as LexisNexis and Factiva, which professional researchers use to find information about legal and business topics. Genealogy researchers can find a wealth of free information online, but often the best sources require payment. Ancestry.com is one such example. It's also becoming more common for online newspapers and magazines to limit free content and erect paywalls that require a subscription to view more than current headlines.

General search engines suffice for most needs. Scholars, journalists, and other serious researchers often must resort to the Deep Web. Do you use any of these tools to access online information that's not available with a quick Google search? Got any search tips of your own to share? Your thoughts on this topic are welcome. Post your comment or question below...

 
This article was posted on 2 Jul 2019



Most recent comments on "Here's How to Search The Deep Web"

Posted by:

FrancesMC
02 Jul 2019

This is absolutely irrelevant to your comments but, in the library world, the Library of Congress is universally known as LC.

When I went to Library School many, many years ago, it was several weeks before I discovered that the LC that people were talking about was the Library of Congress. At that time, and for many years later, LC published its holdings in large and heavy bound volumes. I wonder what happened to them now that everything is online?


Posted by:

Joan
02 Jul 2019

FrancesMC: I think you're referring to the NUC (National Union Catalog) volumes - the big green ones, yes? Libraries that keep them around often stack them up in the shape of a Christmas tree around holiday time. Other than that, they don't have a lot of usefulness...


Posted by:

Lee
03 Jul 2019

A few years ago I looked and the NUC was/is available on CD...our library did not have the money. There is also OCLC which has web access to most books and materials cataloged in its catalog, which can be searched by the public.


Posted by:

Bob Kinsler
07 Jul 2019

Interesting this all started in August 1991. I hate to date myself, but I thought it started way before then.

But I have to remember the systems I was involved with was US Military and the suggestions about using satellites for transmission versus telephone lines or wires was for the betterment of the US Military systems.


Copyright © 2005 - Bob Rankin - All Rights Reserved


Article information: AskBobRankin -- Here's How to Search The Deep Web (Posted: 2 Jul 2019)
Source: https://askbobrankin.com/heres_how_to_search_the_deep_web.html