SEO Strategy for Semantic Search
The advent of blended search in 2007 forced SEOs to rapidly reevaluate their strategies. Blended search, which incorporates video, images, news and other result categories into general Web search results, meant that along with keyword-rich content, engagement objects represented an additional opportunity for SERP real estate.
Another potential game changer, semantic search, has been buzzed about for years now, and with big boy Google beginning to implement some semantic technology, it looks like semantic search could be making its move to the mainstream. So is it time to rethink SEO strategy again, this time for semantic search?
Yesterday I wrote about Ask.com and their progressive search technology, including impressive strides in semantic search. One commenter, Phill Midwinter (aka SEO Phill), brought up the implications semantic search could have on the search engine optimization community:
What I find odd is that despite the improvements taking place with semantic search technology, the majority of SEOs still insist on using a keyword strategy that doesn’t take semantics into account.
ASK is doing some great work and has been innovating steadily for some years now – but the recent slew of semantic improvements from Google are also being taken at face value and strategy isn’t changing.
You only need to look at the recent PageRank sculpting fiasco […] to see that SEOs are losing touch instead of keeping up.
Well isn’t that a frightening prediction. I decided that a semantic search SEO strategy was worth a deeper look, so I pinged Phill to find out more. And omgosh, I had a lot to learn.
Where’s Semantic Search Today?
First some background. Latent Semantic Analysis (LSA) uses statistical algorithms to figure out what Web content is about based on the words used and their relationships with other words on the page. By comparing all the words and the way they’re used, LSA can figure out what a page is actually about and base result rankings on that as opposed to the keyword matching system that powers search today. In this way a search engine using LSA can return the most relevant results — regardless of query word order and regardless of whether or not the result actually includes the queried terms.
Of course, none of today’s major search engines work solely from latent semantic indexing. Rather, LSA is being used on top of the more traditional ranking factors. Phill explained to me (emphasis mine):
The slew of recent upgrades [Google] released are all based on this LSI layer over the top of their existing index. What that means, is that Google is storing an “ontology” or a semantic profile. […] They expose their ontology in four key areas right now: the Wonder Wheel (limited), search suggestions at the bottom of the SERP, AdWords Keyword Tool and their search-based keyword tool.
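To make the LSA idea above a little more concrete, here’s a minimal sketch of how it works under the hood: a term-document count matrix is factored with a truncated SVD so that documents (and queries) can be compared in a low-rank “concept” space rather than by exact keyword matches. This is a toy illustration with an invented corpus and an arbitrary number of topics `k` — not Google’s implementation.

```python
# Toy Latent Semantic Analysis (LSA) sketch. Corpus, query, and k are
# illustrative assumptions, not anything from a real search engine.
import numpy as np

def lsa_similarity(docs, query, k=2):
    """Score each document against the query in a k-dimensional latent space."""
    # Build the vocabulary and the raw term-document count matrix.
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1
    # Truncated SVD: keep only the k strongest latent "topics".
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, Sk = U[:, :k], S[:k]
    # Fold the query into the same latent space.
    q = np.zeros(len(vocab))
    for w in query.lower().split():
        if w in index:
            q[index[w]] += 1
    q_latent = (q @ Uk) / Sk
    docs_latent = Vt[:k].T  # each row: one document in latent space
    # Cosine similarity between the query and each document.
    def cos(a, b):
        n = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / n) if n else 0.0
    return [cos(q_latent, d) for d in docs_latent]
```

The point of the latent space is that documents cluster by topic, so a query can score well against a page that discusses the same concept even when the exact wording differs.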
Off-Page SEO Recommendations
So this is where things get science-y. Phill explained that if you collate the data from the four sources above, you’ll have yourself a map of Google’s semantic profile for a specific Web site. When Phill took this profile and compared it to a map of the site’s inbound links, it turns out the two shapes match! So Phill deduced that Google’s using LSA to analyze link equity as well as for serving queries. If that’s the case, an SEO can use a site’s semantic profile as a guide for the link building strategy. Of course, SEOs have always been concerned with getting the most relevant links to a site, but by using a semantic profile one can tailor the shape of a site with the help of a handy guide map. Oh, and of course, as LSA becomes more predominant, anchor text won’t be the end-all-be-all of link value — all the content on the linking page (or even site) will hold sway.
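The “do the two shapes match?” comparison can be sketched very simply if you represent each profile as a map of terms to weights and measure their cosine similarity. The profile contents and weights below are invented for illustration; this is just one plausible way to quantify the overlap Phill describes, not his actual method.

```python
# Sketch: comparing a site's semantic profile (terms the engine associates
# with the site) against the profile built from its inbound-link anchor and
# page text. Term weights here are hypothetical examples.
import math

def profile_overlap(semantic_profile, link_profile):
    """Cosine similarity between two term->weight maps, in [0, 1]."""
    terms = set(semantic_profile) | set(link_profile)
    dot = sum(semantic_profile.get(t, 0.0) * link_profile.get(t, 0.0) for t in terms)
    na = math.sqrt(sum(v * v for v in semantic_profile.values()))
    nb = math.sqrt(sum(v * v for v in link_profile.values()))
    return dot / (na * nb) if na and nb else 0.0
```

A score near 1.0 would mean the inbound links reinforce the themes the engine already associates with the site; a low score would flag link building that is pulling the site’s “shape” off-topic.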
On-Page SEO Recommendations
And this is where it gets a little scary. Search engine optimizers have become pretty accustomed to using a keyword list as a guide for their on-page SEO efforts. But because LSA is looking at the content on the site as a whole and the relationship between different concepts, trying to fire at a narrow list of keywords is not the best approach. Specific keyword mentions aren’t the key, because as far as LSA is concerned, a truly strong site will cover a topic in a natural and holistic way. Phill recommends that the concise keyword list be transformed into a list of thousands of long tail keywords that guides the content of the site rather than dictates it. By providing broader content topics for copywriters to write about, the copywriter can write naturally, following the guidelines of the site’s ideal semantic profile. According to Phill, this approach will result in a site that is based on the way people actually think and search. With the broad shape of the site in place, tailoring can be done through the site’s inbound links.
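One mechanical way to turn a concise keyword list into the kind of broad long-tail guide described above is simply to cross seed topics with common modifiers, producing candidate phrases that suggest topics for copywriters rather than dictating exact wording. The seeds and modifiers here are placeholders I made up for illustration.

```python
# Sketch: expanding a short keyword list into a long-tail topic guide.
# Seed terms and modifiers are invented examples.
def expand_long_tail(seeds, modifiers):
    """Combine each seed topic with each modifier into candidate phrases."""
    return [f"{m} {s}" for s in seeds for m in modifiers]
```

In practice you’d feed the result to writers as topic inspiration, not as a checklist of phrases to stuff into copy, which is exactly the distinction between guiding and dictating content.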
Search may not depend on semantic technology now, but I do believe it’s worth thinking about today.
4 Replies to “SEO Strategy for Semantic Search”
So what happened then to the siloing thing that Bruce Clay came up with before? If LSI is the thing that works and should be embraced, would that override Bruce Clay’s siloing approach of keeping each page very focused, or?
Virginia here again. I got an email from David Harry which adds another dimension to the discussion. He said I could share some of it here. Snip:
Now, first off I am also happy we didn’t get into the whole LSI and Google mantra that SEOs have been yapping about for a few years. That all started with the purchase of Applied Semantics back in 03. Ye can read all about that here; http://www.huomah.com/search-engines/algorithm-matters/stay-off-the-lsi-bandwagon.html.
Now, since we’re past that, I am more inclined to think along the lines of PLSA (as well as LDA and HTMM) as the engineers over there did seem to have a fancy for it in 07; http://googleresearch.blogspot.com/2007/09/openhtmm-released.html
Also, there was mention of the semantics relating to the Google expanded snippets. That part, while possibly using some form of semantic analysis, is part of the Orion algorithm from Ori Allon; http://www.huomah.com/Search-Engines/Algorithm-Matters/New-Algo-Changes-at-Google.html
And of course we’d be remiss if we didn’t look at Anna Patterson’s phrase based indexing and retrieval (former Googler now running Cuil) which are also potential pieces to the semantic puzzle at Google; http://www.huomah.com/search-engines/algorithm-matters/phrase-based-optimization-resources.html
[…]
I do agree with Phill’s comments though; understanding how search engines are using semantic analysis (and other methods) to define concepts is something not well discussed in the industry. Considering that the lion’s share of searches each day are ambiguous and new (according to Google), they are constantly toying with signals such as semantic analysis (which works hand in hand with query analysis) to better understand what the user is looking for.
But…
“So Phill deduced that Google’s using LSA to analyze link equity as well as for serving queries.” – well, this once more doesn’t prove LSA… but some type of theme/concept categorization (possibly). Since we know LSI was for ad matching, and Phrase Based IR was for the organic (both purchased the same year) we could also (incorrectly) make the assumption that those link valuations are what is causing this. But wait, early the next year (2004) they purchased Kaltix and their Personalized PageRank approaches… also widely believed to be part of the anchor text valuations that hit in ’05-06… hmmmmm… http://www.huomah.com/Search-Engines/Search-Engine-Optimization/Personalized-PageRank-a-user-sensitive-affair.html
Point being, that we can’t jump the gun and start calling Google semantic analysis any one flavour at this point. Applied Semantics was an ad matching technology using LSI; we can’t assume it was added to the regular search processes (the phrase based approach was the same year). I just wanted to add some perspective lest we get another spat of SEOs peddling ‘Google LSI Compliant’ services once more (circa 2006).
M’kay?
Personally, I work with building out semantic concepts with my SEO programs, but I don’t limit my study of them to one aspect (such as LSA/I). Many times I talk to people about losing the old school ‘KW density’ and to think more in core, secondary and related phrase ratios. This applies to not only on-page factors, but links as well (where possible). To that end I wholeheartedly agree with the core assertions being made in the post, just wanted to clear up some of Google’s history and broaden the potential suspects (and steer the crap SEOs away from the re-birth of Google LSI Programs)…
Appreciate the science-y part, but what about sites that don’t necessarily stick to a single topic? CNN is about money, news, celebrity, etc., ad nauseam. Their footprint is going to be much different (and potentially detrimental) compared to a site that is specifically about a single topic.
Regardless, fascinating stuff. Easily the best article I’ve read this week. Thanks for pulling this together.