Tongue slightly in cheek, I'd answer: Lisp, Lisp, and Lisp.
For question number one, this is probably a controversial answer. People who are comfortable with the Microsoft development stack would, of course, consider F# to be more mainstream - after all, it's right there in your developer toolchain, ready to be installed (or already there) on millions upon millions of PCs. The academic community is more excited about Haskell, but there, too, real-world usage falls short of what its advocates would like. Telecom engineers swear by Erlang instead. But taking all the Lisp dialects together - Scheme for teaching and extension languages, Common Lisp for enterprise-size applications, Emacs Lisp for the one true editor, Clojure for the cool young Java kids... - Lisp might just be the most-used predominantly functional language out there. (Note that programming language market share is an intensely politicized topic, and any answer you receive will be ripped to shreds by people with other opinions.)
Question two is easy. Unless you want to count the purely mathematical lambda calculus, Lisp was first. It, too, was initially intended just as a mathematical modelling tool, but it turned out to be easily implementable, and the rest is history. That was in the late 1950s.
Question three: Common Lisp is decidedly non-pure, while Scheme more closely resembles pure axiomatic systems like Haskell, which rigorously separate side effects from computation. Again, questions about market share are always controversial, and Scheme is certainly not the purest FP language, just (I guess) the most used of the relatively pure ones - even if much of this is due to cohorts of computer science freshmen. And Common Lisp is not the least pure language either - I think all of the various "Lisp on the JVM" tools are less pure, since leveraging the huge Java library ecosystem is one of their reasons for existence. If I were forced to guess, I'd say one of them will probably be the next big winner in functional programming.