Microlink API: Browser automation
January 26, 2020
A web browser is one of the most complex pieces of software around, with many internal subsystems working together to resolve any kind of URL on the Internet, even if the content was written with HTML tables in 1992.
Microlink API is a service that provides a high-level API for controlling a browser instance as simply as possible: each feature can be enabled or disabled using query parameters.
When we started the service, only a few things could be done. Now we support more than 30 query parameters.
Only url is required. Alongside it, you can specify any of the following query parameters:
Data
Enrich the response payload with data detected from the target URL.
- audio: enables audio source detection from the target URL.
- data: gets specific content extraction from the target URL.
- filename: defines the filename of the generated asset.
- function: runs JavaScript code with runtime access to a headless browser.
- iframe: gets the embedded representation of the target URL, when one is available.
- insights: gets Lighthouse performance metrics from the target URL.
- meta: gets unified metadata from the target URL.
- palette: gets color information for any image present in the response data.
- pdf: generates a PDF of the target URL.
- screenshot: takes a screenshot of the target URL.
- video: enables video source detection from the target URL.
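As a rough illustration of how these parameters combine with url, here is a minimal sketch that builds a request URL. It assumes the API is reachable at https://api.microlink.io and that boolean parameters are sent as true/false strings; the helper name build_request_url is made up for this example.

```python
from urllib.parse import urlencode

# Assumption: public API endpoint. Check the Microlink docs for your plan.
API_ENDPOINT = "https://api.microlink.io"


def build_request_url(target_url, **params):
    """Return an API request URL for target_url plus optional query parameters."""
    query = {"url": target_url}
    for name, value in params.items():
        # Assumption: booleans travel as lowercase "true"/"false" strings.
        query[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{API_ENDPOINT}?{urlencode(query)}"


# Ask for a screenshot and turn metadata extraction off.
request_url = build_request_url("https://example.com", screenshot=True, meta=False)
print(request_url)
```

The resulting URL can be fetched with any HTTP client; nothing here performs a network request.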
Browser
Tell the browser to act in a certain way or perform some tasks.
- adblock: enable/disable blocking of abusive third-party content on the browser page.
- animations: enable/disable CSS animations and transitions on the browser page.
- click: clicks DOM elements matching the given CSS selectors.
- codeScheme: sets the code syntax highlighting color theme to use.
- colorScheme: sets the browser's preferred color theme.
- device: emulates a specific device (viewport, user agent, dimensions, etc).
- javascript: enable/disable the JavaScript engine on the entire browser page.
- mediaType: changes the CSS media type of the page.
- modules: injects <script type="module"> into the browser page.
- ping: enable/disable resolving all URLs present in the payload.
- prerender: enable/disable browser navigation.
- proxy: uses a proxy server as an intermediary during the requests.
- retry: sets the number of retries, with exponential backoff, to perform after an unexpected browser error.
- scripts: injects <script> into the browser page.
- scroll: scrolls to the DOM element matching the given CSS selector.
- styles: injects <style> into the browser page.
- viewport: establishes a set of properties related to the browser's visible area.
- waitForSelector: waits for one or more CSS selectors to appear in the page.
- waitForTimeout: waits the given number of milliseconds before processing the content of the browser page.
- waitUntil: waits for one or more browser events before considering navigation succeeded.
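Browser parameters travel in the same query string as url. The sketch below combines a few of them and then decodes the URL again to show what the service would receive. The endpoint, the #accept-cookies selector, and the true/false serialization are assumptions made for illustration only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: public API endpoint.
API_ENDPOINT = "https://api.microlink.io"

browser_options = {
    "adblock": True,
    "animations": False,
    "click": "#accept-cookies",  # hypothetical CSS selector
    "waitForTimeout": 3000,      # milliseconds
}


def to_query_value(value):
    """Serialize a Python value for the query string."""
    if isinstance(value, bool):
        # Assumption: booleans are sent as "true"/"false".
        return "true" if value else "false"
    return str(value)


query = {"url": "https://example.com"}
query.update({name: to_query_value(value) for name, value in browser_options.items()})
browser_request_url = f"{API_ENDPOINT}?{urlencode(query)}"

# Decode the query string again to inspect the parameters as sent.
received = parse_qs(urlsplit(browser_request_url).query)
print(received)
```

Keeping the options in a plain dictionary makes it easy to toggle a single browser behavior without touching the rest of the request.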
Response
Apply modifications to the response data so it better fits your needs.
- embed: embeds a specific response data field, respecting its content type.
- filter: filters a list of properties from the response data to save bandwidth.
- force: forces a fresh response, bypassing the cache layer.
- headers: customizes requests using custom HTTP headers.
- timeout: defines the maximum amount of time allowed for resolving a request.
- ttl: sets the time-to-live of a resource in the cache layer before it is refreshed.
- staleTtl: sets when a cached resource is considered stale, so that it is refreshed in the background.
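Because embed returns a single field with its own content type, the request URL itself could be used wherever an asset URL is expected. The sketch below builds an image tag that way. It assumes the endpoint https://api.microlink.io and that embed accepts a dotted field path such as screenshot.url; the function name screenshot_image_tag is invented for this example.

```python
from html import escape
from urllib.parse import urlencode

# Assumption: public API endpoint.
API_ENDPOINT = "https://api.microlink.io"


def screenshot_image_tag(target_url, alt_text):
    """Return an <img> tag whose source is an API request for a screenshot."""
    query = urlencode({
        "url": target_url,
        "screenshot": "true",
        "meta": "false",
        "embed": "screenshot.url",  # assumption: dotted path to the field to embed
    })
    src = f"{API_ENDPOINT}?{query}"
    # Escape both values so they are safe inside HTML attributes.
    return f'<img src="{escape(src)}" alt="{escape(alt_text)}">'


image_tag = screenshot_image_tag("https://example.com", "Example homepage")
print(image_tag)
```

The same pattern should apply to other embeddable fields, as long as the consumer understands the content type that comes back.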
Join the community
All of these improvements and features are community-driven: we listen to your feedback and act accordingly.
Whether you are building a product and need fancy previews, you’re an indie hacker, or you simply like frontend stuff, come chat with us 🙂.