Common Crawl provides public access to its huge web index.

Google is a powerful search engine, as are Bing, Yandex, et al., but they're all proprietary: their spiders crawl the web and vacuum up information, which they store within their own walls. (Google calls its web index BigTable.) Yes, we can use their search engine user interfaces, but exactly which algorithms they use remains proprietary and, for the most part, secret.

The Common Crawl Foundation (Commoncrawl.org) was created in 2007 with the goal of crawling the web and making the discovered information available to the public, to do with as it pleases. Common Crawl claims to have stored about six billion web pages in its index, and it publishes a free library of program code to access it.
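The post doesn't name that code library, so here is just one illustrative sketch of how you can look pages up in Common Crawl's public index today, via its CDX index API. The crawl label `CC-MAIN-2023-50` and the sample record below are assumptions for illustration; real code would pick a current crawl from the list published at index.commoncrawl.org.

```python
# Hedged sketch: querying Common Crawl's CDX index for captures of a URL.
# Assumption: the CDX API at index.commoncrawl.org, which returns one
# JSON record per line; "CC-MAIN-2023-50" is an example crawl label.
import json
from urllib.parse import urlencode

CDX_HOST = "https://index.commoncrawl.org"

def cdx_query_url(crawl, url, limit=5):
    """Build a CDX index query URL for the given crawl and target URL."""
    params = urlencode({"url": url, "output": "json", "limit": limit})
    return f"{CDX_HOST}/{crawl}-index?{params}"

def parse_cdx_record(line):
    """Parse one JSON line of a CDX response into the fields needed to
    fetch the page body from the WARC archive (filename, offset, length)."""
    rec = json.loads(line)
    return {
        "url": rec["url"],
        "warc_file": rec["filename"],
        "offset": int(rec["offset"]),
        "length": int(rec["length"]),
    }

# A hypothetical sample record in the shape the index returns:
sample = ('{"urlkey": "com,example)/", "timestamp": "20231201000000", '
          '"url": "http://example.com/", "status": "200", '
          '"filename": "crawl-data/CC-MAIN-2023-50/segments/example.warc.gz", '
          '"offset": "12345", "length": "6789"}')

print(cdx_query_url("CC-MAIN-2023-50", "example.com"))
print(parse_cdx_record(sample)["warc_file"])
```

With the `offset` and `length` from a record, a client can issue an HTTP range request against the named WARC file to pull down just that one archived page, rather than an entire multi-gigabyte crawl segment.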

Applications that use the Common Crawl index are beginning to appear. Lucky Oyster uses the Common Crawl index to reveal previously hidden social networking relationships to users.

MIT's Technology Review recently published an article speculating that, thanks to Common Crawl, Google-scale start-ups can now get underway without having to crawl the web themselves, dramatically reducing their need for capital. Walled gardens such as Facebook and LinkedIn block spiders from crawling their sites; they're all about locking up information. It'll be fun to watch the tug of war between the proprietary and the open model in the web search arena. My money is on the open model.

Visit my website: http://russbellew.com
© Russ Bellew · Fort Lauderdale, Florida, USA · phone 954 873-4695
