
Parallel HTTP Requests with HTTPoison in Elixir

Ever wondered how to send multiple HTTP requests from multiple processes in Elixir? Let's use Elixir's glorious Tasks to map out your desired website in parallel.

defmodule Looter do
  # Fires off one Task per URL, then waits for all of them to finish.
  def grabber(urls \\ ["http://www.simon-neutert.de", "http://www.simon-neutert.de/posts"]) do
    Enum.map(urls, fn(url) -> Task.async(fn -> Looter.digger(url) end) end)
    |> Enum.map(fn(task) -> Task.await(task, 145000) end) # 145000 == Timeout in milliseconds
  end

  # Fetches a single URL and extracts the <title> tag from its body.
  def digger(url) do
    %HTTPoison.Response{body: body, status_code: status_code} = HTTPoison.get!(url)

    case status_code do
      200 ->
        # Floki.find/2 returns a list of {tag, attributes, children} tuples.
        {_, _, title} = List.first(Floki.find(body, "title"))
        IO.puts title
        {:ok, title}

      _ ->
        IO.puts "Error #{status_code}"
        {:error, "Error #{status_code}"}
    end
  end
end
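
The example assumes both HTTPoison and Floki are available as dependencies; a minimal mix.exs sketch (the version requirements are illustrative, not from the original setup):

# in mix.exs
defp deps do
  [
    {:httpoison, "~> 1.8"},  # HTTP client
    {:floki, "~> 0.30"}      # HTML parser used to pull out the <title>
  ]
end

With the dependencies fetched, you can try it from iex -S mix by calling Looter.grabber() or passing in your own list of URLs.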

Bonus: Experiment with Enum.chunk(), so you can set up pools of workers and limit the number of processes created; a sketch follows below.
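
A minimal sketch of that pooling idea, assuming a hypothetical Looter.Pooled module and an example pool size of 25 (newer Elixir versions deprecate Enum.chunk/2 in favour of Enum.chunk_every/2):

defmodule Looter.Pooled do
  # Processes the URLs in chunks of `pool_size`, so no more than
  # `pool_size` tasks are alive at any one time.
  def grabber(urls, pool_size \\ 25) do
    urls
    |> Enum.chunk_every(pool_size)
    |> Enum.flat_map(fn(chunk) ->
      chunk
      |> Enum.map(fn(url) -> Task.async(fn -> Looter.digger(url) end) end)
      |> Enum.map(fn(task) -> Task.await(task, 145000) end)
    end)
  end
end

Each chunk runs in parallel internally, while the chunks themselves are processed one after another, which keeps the number of open connections bounded.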

The key to what makes this so efficient is:

Enum.map(urls, fn(url) -> Task.async(fn -> Looter.digger(url) end) end)
|> Enum.map(fn(task) -> Task.await(task, 145000) end)
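
If you are on Elixir 1.4 or later, Task.async_stream/3 wraps this same async/await pattern and caps concurrency for you; a hedged alternative to the snippet above (not from the original post):

urls
|> Task.async_stream(&Looter.digger/1, max_concurrency: 25, timeout: 145000)
|> Enum.map(fn({:ok, result}) -> result end) # assumes every task succeeds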

Read more about parallel maps on elixir-recipes, and holyxiaoxin's article is well worth reading, too.

Not fast enough? Don't mind using Ruby? I came up with a solution using EM-Synchrony and Nokogiri that is up to 4 times faster while using only around 25 concurrent connections.

