As you may have read in several of my blog posts (such as 5 reasons to stop with WordPress, or Medium stopped offering custom domains), I never really settled on a blog framework. Because I switch so often, my sites' rankings in Google Search tend to fluctuate a lot.
So every now and then I use Screaming Frog to test my sites and see what I can do to improve them. I don’t use the Screaming Frog GUI for this; I use the command line.
From the command line I run the Bulk Export and Export Tabs exports and process the results with Python into a readable Word document. I built that a while ago, just for fun.
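A minimal sketch of how such a command line run could be driven from Python. The binary name `screamingfrogseospider`, the example URL and the output folder are assumptions (the binary name differs per platform), and the helper names `build_command` and `run_crawl` are made up for this example. The flags `--crawl`, `--headless`, `--output-folder`, `--bulk-export` and `--export-tabs` are taken from the Screaming Frog command line documentation, but check them against your installed version.

```python
import subprocess


def build_command(url, output_folder, bulk_exports, export_tabs,
                  binary="screamingfrogseospider"):
    """Build the argument list for a headless Screaming Frog crawl.

    `binary` is an assumed executable name; adjust it for your platform.
    """
    cmd = [binary, "--crawl", url, "--headless", "--output-folder", output_folder]
    if bulk_exports:
        # Multiple exports are passed as one comma-separated value.
        cmd += ["--bulk-export", ",".join(bulk_exports)]
    if export_tabs:
        cmd += ["--export-tabs", ",".join(export_tabs)]
    return cmd


def run_crawl(cmd):
    """Run the crawl and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    command = build_command(
        "https://example.com",          # hypothetical site
        "/tmp/screamingfrog",           # hypothetical output folder
        ["All Inlinks"],
        ["Internal:All"],
    )
    print(command)                      # inspect before calling run_crawl(command)
```

Passing the arguments as a list avoids shell quoting problems with names that contain spaces or parentheses, such as "Client Error (4xx) Inlinks".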
The ‘Bulk Export’ in the Screaming Frog GUI is located under the top level menu and allows bulk exporting of all data. You can export all instances of a link found in a crawl via the ‘All inlinks’ option, or export all inlinks to URLs with specific status codes such as 2XX, 3XX, 4XX or 5XX responses.
Because I drive it from Python, I wanted to know the names of all the Bulk Export items and the filenames that Screaming Frog writes for each of them.
And guess what... there is no complete list! Not even the Screaming Frog people could tell me which Bulk Export items exist. I asked them:
“I would really like an overview (for the command line) of the options for --bulk-export and --export-tabs.”
Their reply:
“We don’t have one I am afraid. We may look at writing them all out at some stage, but we don’t have one now, or one I can offer you at a set timescale unfortunately.”
Python OCR the Bulk Export tab images
So there was no official way to get the Screaming Frog Bulk Export list, and I did not want to type 60 Bulk Export names and filenames by hand.
What I could do was capture the names shown in the GUI overview as images and extract the text from them. So I wrote some Python to pull the text out of those images. (Yes, I might as well have typed the names myself, but I just love coding 🙂)
So I made screenshots, saved them in a folder, and ran my Python script.
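The script itself is not shown in this post, so the following is only a sketch of how the screenshot step could look. It assumes the screenshots are PNG files, and that the `pytesseract` and `Pillow` packages plus the Tesseract binary are installed. The function names `clean_ocr_lines` and `ocr_folder` and the folder name are hypothetical.

```python
from pathlib import Path


def clean_ocr_lines(text):
    """Turn raw OCR text into a list of non-empty, de-duplicated menu names."""
    names = []
    for line in text.splitlines():
        name = " ".join(line.split())  # collapse runs of whitespace
        if name and name not in names:
            names.append(name)
    return names


def ocr_folder(folder):
    """OCR every PNG in `folder` (sorted by filename) and return the names found."""
    # Imported here so the cleaning helper works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    names = []
    for path in sorted(Path(folder).glob("*.png")):
        text = pytesseract.image_to_string(Image.open(path))
        for name in clean_ocr_lines(text):
            if name not in names:
                names.append(name)
    return names


if __name__ == "__main__":
    # "screenshots" is a hypothetical folder name.
    for menu_name in ocr_folder("screenshots"):
        print(f"name: {menu_name}")
```

OCR output usually still needs a manual check: misread characters are common in menu screenshots, so compare the result against the GUI before using the names on the command line.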
Screaming Frog Bulk Export List
So here is the list. Here, name is the actual name you can pass to the Screaming Frog command line option “--bulk-export”, and file is the filename that the command line run produces.
YML output can be found here.
name: All Inlinks
name: All Outlinks
name: Queued URLs
name: All Anchor Text
name: All Images
name: All Page Source
name: External Links
name: Response Codes:Blocked by Robots.txt Inlinks
name: Response Codes:Blocked Resource Inlinks
name: Response Codes:No Response Inlinks
name: Response Codes:Success (2xx) Inlinks
name: Response Codes:Redirection (3xx) Inlinks
name: Response Codes:Redirection (Meta Refresh) Inlinks
name: Response Codes:Client Error (4xx) Inlinks
name: Response Codes:Server Error (5xx) Inlinks
name: Directives:Index Inlinks
name: Directives:Noindex Inlinks
name: Directives:Follow Inlinks
name: Directives:Nofollow Inlinks
name: Directives:None Inlinks
name: Directives:NoArchive Inlinks
name: Directives:NoSnippet Inlinks
name: Directives:Max-Snippet Inlinks
name: Directives:Max-Image-Preview Inlinks
name: Directives:Max-Video-Preview Inlinks
name: Directives:NoODP Inlinks
name: Directives:NoYDIR Inlinks
name: Directives:NoImageIndex Inlinks
name: Directives:NoTranslate Inlinks
name: Directives:Unavailable_After Inlinks
name: Directives:Refresh Inlinks
name: Canonicals:Contains Canonical Inlinks
name: Canonicals:Self Referencing Inlinks
name: Canonicals:Canonicalised Inlinks
name: Canonicals:Missing Inlinks
name: Canonicals:Multiple Inlinks
name: Canonicals:Non-Indexable Canonical Inlinks
name: AMP:All Inlinks
name: AMP:Non-200 Response Inlinks
name: AMP:Non-Confirming Canonical Inlinks
name: AMP:Missing Non-AMP Canonical Inlinks
name: AMP:Non-Indexable Canonical Inlinks
name: AMP:Indexable Inlinks
name: AMP:Non-Indexable Inlinks
name: Structured Data:Contains Structured Data
name: Structured Data:Validation Errors
name: Structured Data:Validation Warnings
name: Structured Data:JSON-LD URLs
name: Structured Data:Microdata URLs
name: Structured Data:RDFa URLs
name: Images:Images Missing Alt Text Inlinks
name: Images:Images over X KB Inlinks
name: Sitemaps:URLs in Sitemap Inlinks
name: Sitemaps:Orphan URLs Inlinks
name: Sitemaps:Non-Indexable URLs in Sitemap Inlinks
name: Sitemaps:URLs in Multiple Sitemaps Inlinks
name: Custom Search:All Inlinks
name: Custom Extraction:All Inlinks
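To use the list above from Python, the "name:" lines can be parsed and grouped by submenu (the part before the colon). This is a small sketch, assuming the list is available as plain text in exactly the format shown above. The function names `parse_names` and `group_by_submenu` are made up for this example.

```python
from collections import defaultdict


def parse_names(text):
    """Return the values of all 'name: ...' lines, in order."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("name:"):
            names.append(line[len("name:"):].strip())
    return names


def group_by_submenu(names):
    """Group 'Submenu:Export' names by submenu; top-level items go under ''."""
    groups = defaultdict(list)
    for name in names:
        submenu, sep, export = name.partition(":")
        if sep:
            groups[submenu].append(export)
        else:
            groups[""].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = """\
name: All Inlinks
name: Response Codes:Client Error (4xx) Inlinks
name: Response Codes:Server Error (5xx) Inlinks
"""
    names = parse_names(sample)
    print(group_by_submenu(names))
    # Selected names joined with commas form the value for --bulk-export.
    print(",".join(names))
```

Keeping the full "Submenu:Export" string in `names` matters, because that full string is what the list above gives as the value for --bulk-export.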