New UrlMetrics fields, and support for calculating 'missing' fields.
Added constants for all of the new fields now returned by UrlMetrics.

Added a list of fields whose values are not returned by UrlMetrics,
but which can be calculated from other fields. For example,
followed internal links can be calculated from total followed links
minus followed external links. If the necessary fields are present
in the response, the calculated fields will automatically be added.
If you request the calculated fields, the request will actually be
for the fields that it depends upon.

[Internal tracking: TUBO-869]
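The calculation described above can be sketched roughly as follows. This is a minimal illustration, not the gem's implementation: the constant `CALCULATION_MAP` and the method `add_calculated_fields` are hypothetical names chosen for this example, and only the field symbols are taken from this commit.

```ruby
# Hypothetical sketch: derive "missing" fields from fields present in a response.
# Each entry is [derived_key, [whole_key, known_part_key]].
CALCULATION_MAP = [
  [:juice_internal_links,      [:juice_links, :external_links]],
  [:internal_links,            [:links, :all_external_links]],
  # Depends on the two derived values above, so it must come after them.
  [:unfollowed_internal_links, [:internal_links, :juice_internal_links]],
].freeze

# Returns a copy of +data+ with every derivable field filled in.
# A field is only derived when both of its inputs are present.
def add_calculated_fields(data)
  result = data.dup
  CALCULATION_MAP.each do |derived, (whole, part)|
    next if result.key?(derived)
    next unless result.key?(whole) && result.key?(part)
    result[derived] = result[whole] - result[part]
  end
  result
end

metrics = add_calculated_fields(
  { links: 100, juice_links: 80, all_external_links: 30, external_links: 25 }
)
```

Because the list is walked in order, an entry that depends on another calculated value only works if its inputs were derived earlier in the same pass, which is why ordering matters.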
Martin Tithonium committed Dec 17, 2010
1 parent ecccf5d commit e90eb44
Showing 8 changed files with 490 additions and 133 deletions.
109 changes: 83 additions & 26 deletions README.rdoc
@@ -132,40 +132,98 @@ Depending on the type of data point return, you may access certain data points i

=== Source/Target/URL Metrics

* <tt>:status</tt>
* <tt>:fq_domain_mozrank</tt>
* <tt>:pl_domain_links</tt>
* <tt>:pl_domain_external_links</tt>
* <tt>:all_external_links</tt>
* <tt>:canonical_internal_id</tt>
* <tt>:canonical_url</tt>
* <tt>:cblocks_linking</tt>
* <tt>:domain_authority</tt>
* <tt>:domain_authority_raw</tt>
* <tt>:external_links</tt>
* <tt>:external_mozrank</tt>
* <tt>:external_mozrank_raw</tt>
* <tt>:fq_domain</tt>
* <tt>:fq_domain_all_external_links</tt>
* <tt>:fq_domain_external_links</tt>
* <tt>:fq_domain_external_mozrank_sum</tt>
* <tt>:fq_domain_external_mozrank_sum_raw</tt>
* <tt>:fq_domain_fq_domains_linking</tt>
* <tt>:fq_domain_internal_links</tt>
* <tt>:fq_domain_juice_fq_domains_linking</tt>
* <tt>:fq_domain_juice_internal_links</tt>
* <tt>:fq_domain_juice_links</tt>
* <tt>:fq_domain_juice_pl_domains_linking</tt>
* <tt>:fq_domain_links</tt>
* <tt>:fq_domain_mozrank</tt>
* <tt>:fq_domain_mozrank_raw</tt>
* <tt>:fq_domain_mozrank_sum</tt>
* <tt>:fq_domain_mozrank_sum_raw</tt>
* <tt>:fq_domain_moztrust</tt>
* <tt>:fq_domain_moztrust_raw</tt>
* <tt>:fq_domain_pl_domains_linking</tt>
* <tt>:pl_domain</tt>
* <tt>:url</tt>
* <tt>:fq_domain_unfollowed_external_links</tt>
* <tt>:fq_domain_unfollowed_fq_domains_linking</tt>
* <tt>:fq_domain_unfollowed_internal_links</tt>
* <tt>:fq_domain_unfollowed_links</tt>
* <tt>:fq_domain_unfollowed_pl_domains_linking</tt>
* <tt>:fq_domain_updated_at</tt>
* <tt>:fq_domains_linking</tt>
* <tt>:internal_id</tt>
* <tt>:internal_links</tt>
* <tt>:ips_linking</tt>
* <tt>:juice_cblocks_linking</tt>
* <tt>:juice_fq_domains_linking</tt>
* <tt>:juice_internal_links</tt>
* <tt>:juice_ips_linking</tt>
* <tt>:juice_links</tt>
* <tt>:juice_pl_domains_linking</tt>
* <tt>:links</tt>
* <tt>:mozrank</tt>
* <tt>:mozrank_raw</tt>
* <tt>:moztrust</tt>
* <tt>:moztrust_raw</tt>
* <tt>:page_authority</tt>
* <tt>:page_authority_raw</tt>
* <tt>:pl_domain</tt>
* <tt>:pl_domain_all_external_links</tt>
* <tt>:pl_domain_cblocks_linking</tt>
* <tt>:pl_domain_external_links</tt>
* <tt>:pl_domain_external_mozrank_sum</tt>
* <tt>:pl_domain_external_mozrank_sum_raw</tt>
* <tt>:links</tt>
* <tt>:external_mozrank</tt>
* <tt>:pl_domain_internal_links</tt>
* <tt>:pl_domain_ips_linking</tt>
* <tt>:pl_domain_juice_cblocks_linking</tt>
* <tt>:pl_domain_juice_internal_links</tt>
* <tt>:pl_domain_juice_ips_linking</tt>
* <tt>:pl_domain_juice_links</tt>
* <tt>:pl_domain_juice_pl_domains_linking</tt>
* <tt>:pl_domain_links</tt>
* <tt>:pl_domain_mozrank</tt>
* <tt>:juice_links</tt>
* <tt>:title</tt>
* <tt>:fq_domains_linking</tt>
* <tt>:page_authority</tt>
* <tt>:fq_domain_external_mozrank_sum_raw</tt>
* <tt>:pl_domain_moztrust</tt>
* <tt>:fq_domain_external_links</tt>
* <tt>:domain_authority_raw</tt>
* <tt>:canonical_url</tt>
* <tt>:pl_domain_mozrank_raw</tt>
* <tt>:pl_domain_mozrank_sum</tt>
* <tt>:pl_domain_mozrank_sum_raw</tt>
* <tt>:fq_domain_links</tt>
* <tt>:all</tt>
* <tt>:mozrank</tt>
* <tt>:pl_domain_moztrust</tt>
* <tt>:pl_domain_moztrust_raw</tt>
* <tt>:pl_domain_pl_domains_linking</tt>
* <tt>:external_links</tt>
* <tt>:fq_domain_fq_domains_linking</tt>
* <tt>:pl_domain_unfollowed_cblocks_linking</tt>
* <tt>:pl_domain_unfollowed_external_links</tt>
* <tt>:pl_domain_unfollowed_internal_links</tt>
* <tt>:pl_domain_unfollowed_ips_linking</tt>
* <tt>:pl_domain_unfollowed_links</tt>
* <tt>:pl_domain_unfollowed_pl_domains_linking</tt>
* <tt>:pl_domain_updated_at</tt>
* <tt>:pl_domains_linking</tt>
* <tt>:domain_authority</tt>
* <tt>:fq_domain_mozrank_sum_raw</tt>
* <tt>:moztrust</tt>
* <tt>:status</tt>
* <tt>:title</tt>
* <tt>:unfollowed_cblocks_linking</tt>
* <tt>:unfollowed_external_links</tt>
* <tt>:unfollowed_fq_domains_linking</tt>
* <tt>:unfollowed_internal_links</tt>
* <tt>:unfollowed_ips_linking</tt>
* <tt>:unfollowed_links</tt>
* <tt>:unfollowed_pl_domains_linking</tt>
* <tt>:updated_at</tt>
* <tt>:url</tt>


=== Link Metrics

@@ -182,7 +240,6 @@ Depending on the type of data point return, you may access certain data points i
* <tt>:text</tt>
* <tt>:internal_subdomains_linking</tt>
* <tt>:external_domains_linking</tt>
* <tt>:all</tt>
* <tt>:record_id</tt>
* <tt>:external_pages_linking</tt>

2 changes: 1 addition & 1 deletion Rakefile
@@ -9,7 +9,7 @@ begin
gem.description = %Q{Provides an interface to SEOmoz's suite of APIs, including the free and site intelligence APIs.}
gem.email = %q{[email protected]}
gem.homepage = "http://github.com/seomoz/linkscape-gem"
gem.authors = ["Marty Smyth", "Jeff Pollard"]
gem.authors = ["Martin Tithonium", "Jeff Pollard", "Bryce Howard"]
gem.add_dependency "ruby-hmac", ">= 0"
# gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
end
147 changes: 137 additions & 10 deletions lib/linkscape/constants.rb
@@ -242,6 +242,98 @@ module Constants
:desc => %Q[The pay-level domain (PL domain) as it's identified in the Linkscape index],
},

:ued => {
:key => :all_external_links,
:name => 'All external links page to page',
:desc => %Q[The number of external links from one page to another (including followed and nofollowed).]
},
:ujfq => {
:key => :juice_fq_domains_linking,
:name => 'Followed Domains Linking Page',
:desc => %Q[The number of unique domains with followed links to the target url.]
},
:ujp => {
:key => :juice_ips_linking,
:name => 'Followed IPs Linking',
:desc => %Q[The number of unique IPs with a followable link to a target url.]
},
:uip => {
:key => :ips_linking,
:name => 'IPs Linking',
:desc => %Q[The total number of unique IPs linking to a target url.]
},
:ujpl => {
:key => :juice_pl_domains_linking,
:name => 'Followed Domains to Page',
:desc => %Q[The number of unique domains with followed links to a given url.]
},
:uib => {
:key => :cblocks_linking,
:name => 'All Cblock Linking',
:desc => %Q[The total number of unique cblocks linking to a page.]
},
:ujb => {
:key => :juice_cblocks_linking,
:name => 'Followed Cblocks Linking',
:desc => %Q[The total number of unique cblocks with followed links to a page.]
},
:fjid => {
:key => :fq_domain_juice_links,
:name => 'Followed Subdomain Linking Domains',
:desc => %Q[A count of all unique subdomains with followed links to the target domain.]
},
:fed => {
:key => :fq_domain_all_external_links,
:name => 'Subdomain External Links',
:desc => %Q[The total number of external links (followed and nofollowed) to the subdomain of the url.]
},
:fjf => {
:key => :fq_domain_juice_fq_domains_linking,
:name => 'Followed Subdomain Subdomains Links',
:desc => %Q[The number of subdomains with followed links to the subdomain of the url.]
},
:fjd => {
:key => :fq_domain_juice_pl_domains_linking,
:name => 'Followed Domain Subdomains Links',
:desc => %Q[The number of unique domains with followed links to the subdomain of the url.]
},
:pjid => {
:key => :pl_domain_juice_links,
:name => 'Followed Root Domain Links',
:desc => %Q[The total number of followed links (both internal and external) from a page to a domain.]
},
:ped => {
:key => :pl_domain_all_external_links,
:name => 'All Root Domain External Links',
:desc => %Q[The total number of external links (both followed and no-followed) from a page to a domain.]
},
:pjd => {
:key => :pl_domain_juice_pl_domains_linking,
:name => 'All Followed Root Domains Linking Domain',
:desc => %Q[The total number of followed root domains linking to the target's domain.]
},
:pip => {
:key => :pl_domain_ips_linking,
:name => 'IPs Linking to Domain',
:desc => %Q[The total number of unique IPs linking to the target's domain.]
},
:pjip => {
:key => :pl_domain_juice_ips_linking,
:name => 'Followed IPs Linking to Domain',
:desc => %Q[The total number of unique IPs with followed links to the target's domain.]
},
:pib => {
:key => :pl_domain_cblocks_linking,
:name => 'All Cblock Linking Domain',
:desc => %Q[The number of unique cblocks with a link to a domain.]
},
:pjb => {
:key => :pl_domain_juice_cblocks_linking,
:name => 'Followed Cblock Linking Domain',
:desc => %Q[The total number of cblocks with followed links to a domain.]
},


:upa => {
:name => 'Page Authority',
:key => :page_authority,
@@ -268,8 +360,8 @@ module Constants
nil => :source,
:lu => :target,
}


LinkResponseFields = {
:t => {
:name => 'Anchor Text',
@@ -311,8 +403,8 @@ module Constants
LinkResponsePrefixes = {
:l => :link,
}


AnchorResponseFields = {
:t => {
:name => 'Anchor Text',
@@ -374,10 +466,45 @@ module Constants
:atf => :anchor,
:atu => :anchor,
}



# For values calculated from other values.
# format is [ :unknown_fraction, [ :whole_value, :known_fraction ] ]
# result is data[:unknown_fraction] = data[:whole_value] - data[:known_fraction]
# IFF both data[:whole_value] and data[:known_fraction] are present.
#
# Some calculated values are based on other calculated values,
# so be careful about the ordering of the list.
CalculationKeyMap = [
[:unfollowed_external_links, [ :all_external_links, :external_links ]],
[:unfollowed_links, [ :links, :juice_links ]],
[:juice_internal_links, [ :juice_links, :external_links ]],
[:internal_links, [ :links, :all_external_links ]],
[:unfollowed_internal_links, [ :internal_links, :juice_internal_links ]],

[:fq_domain_unfollowed_external_links, [ :fq_domain_all_external_links, :fq_domain_external_links ]],
[:fq_domain_unfollowed_links, [ :fq_domain_links, :fq_domain_juice_links ]],
[:fq_domain_juice_internal_links, [ :fq_domain_juice_links, :fq_domain_external_links ]],
[:fq_domain_internal_links, [ :fq_domain_links, :fq_domain_all_external_links ]],
[:fq_domain_unfollowed_internal_links, [ :fq_domain_internal_links, :fq_domain_juice_internal_links ]],

[:pl_domain_unfollowed_external_links, [ :pl_domain_all_external_links, :pl_domain_external_links ]],
[:pl_domain_unfollowed_links, [ :pl_domain_links, :pl_domain_juice_links ]],
[:pl_domain_juice_internal_links, [ :pl_domain_juice_links, :pl_domain_external_links ]],
[:pl_domain_internal_links, [ :pl_domain_links, :pl_domain_all_external_links ]],
[:pl_domain_unfollowed_internal_links, [ :pl_domain_internal_links, :pl_domain_juice_internal_links ]],

[:unfollowed_fq_domains_linking, [ :fq_domains_linking, :juice_fq_domains_linking ]],
[:unfollowed_pl_domains_linking, [ :pl_domains_linking, :juice_pl_domains_linking ]],
[:fq_domain_unfollowed_fq_domains_linking, [ :fq_domain_fq_domains_linking, :fq_domain_juice_fq_domains_linking ]],
[:pl_domain_unfollowed_pl_domains_linking, [ :pl_domain_pl_domains_linking, :pl_domain_juice_pl_domains_linking ]],
[:fq_domain_unfollowed_pl_domains_linking, [ :fq_domain_pl_domains_linking, :fq_domain_juice_pl_domains_linking ]],
[:unfollowed_ips_linking, [ :ips_linking, :juice_ips_linking ]],
[:unfollowed_cblocks_linking, [ :cblocks_linking, :juice_cblocks_linking ]],
[:pl_domain_unfollowed_cblocks_linking, [ :pl_domain_cblocks_linking, :pl_domain_juice_cblocks_linking ]],
[:pl_domain_unfollowed_ips_linking, [ :pl_domain_ips_linking, :pl_domain_juice_ips_linking ]],
]
ResponseFields = {}

URLResponsePrefixes.each do |prefix, subject|
URLResponseFields.each do |k,v|
v = v.dup.merge(
@@ -408,11 +535,11 @@ module Constants
ResponseFields[v[:source]] = v
end
end

ResponseFields.keys.each {|k| ResponseFields[ResponseFields[k][:key]] ||= ResponseFields[k] if ResponseFields[k][:key] }

LongestNameLength = ResponseFields.collect{|k,v|v[:name].length}.max
LongestKeyLength = ResponseFields.collect{|k,v|v[:key].to_s.length}.max

end
end
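
The commit message also says that requesting a calculated field results in a request for the fields it depends on. One way this could work is sketched below, under stated assumptions: `DEPENDENCIES` and `expand_requested_fields` are hypothetical names for this example, and the hash is a simplified stand-in for a few CalculationKeyMap entries, not the gem's actual request code.

```ruby
# Hypothetical sketch: replace calculated fields in a request with the
# fields they are derived from. Keys are derived fields; values are inputs.
DEPENDENCIES = {
  juice_internal_links:      [:juice_links, :external_links],
  internal_links:            [:links, :all_external_links],
  unfollowed_internal_links: [:internal_links, :juice_internal_links],
}.freeze

# Recursively expands each requested key until only directly
# requestable fields remain, then removes duplicates.
def expand_requested_fields(requested)
  requested.flat_map do |key|
    deps = DEPENDENCIES[key]
    deps ? expand_requested_fields(deps) : [key]
  end.uniq
end

fields = expand_requested_fields([:unfollowed_internal_links, :mozrank])
```

The recursion handles calculated fields that are themselves built from other calculated fields, such as `:unfollowed_internal_links`.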