There are a lot of factors that go into determining whether a particular shadowing technique is really "the best", and those factors are usually different for every project. Otherwise everyone would agree, and we would be using the same thing in every game. It's going to be up to you to weigh the pros and cons of different approaches, and decide what's actually the best fit for your project and target hardware.
In general, "standard" depth buffer shadow maps with various forms of PCF are still the most popular choice for games. They're cheap to render, easy to set up, they're supported on a very wide range of hardware, and they have well-understood flaws (mostly related to filtering and biasing).
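To make the filtering-order point concrete, here's a minimal CPU-side sketch of PCF in Python (the array layout, bias value, and kernel size are all illustrative assumptions, not any particular engine's implementation). The key detail is that the depth comparison happens per-texel, before the results are averaged:

```python
import numpy as np

def pcf_shadow(shadow_map, u, v, receiver_depth, bias=0.002, kernel=3):
    """Percentage-closer filtering: average the *results* of per-texel
    depth comparisons over a small kernel of shadow-map texels.
    shadow_map: 2D array of stored occluder depths in [0, 1]."""
    h, w = shadow_map.shape
    x, y = int(u * (w - 1)), int(v * (h - 1))
    r = kernel // 2
    lit, count = 0.0, 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            sx = min(max(x + dx, 0), w - 1)
            sy = min(max(y + dy, 0), h - 1)
            # The comparison comes *before* the averaging -- this ordering
            # is why standard depth shadow maps can't be pre-filtered.
            lit += 1.0 if receiver_depth - bias <= shadow_map[sy, sx] else 0.0
            count += 1
    return lit / count  # fraction of the kernel that considers us lit
```

In a real shader this is done with hardware comparison samplers rather than explicit loops, but the ordering constraint is the same.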
VSM's primary advantage over standard shadow maps is that they're pre-filterable. Unlike depth buffer shadow maps, where you have to perform the depth comparison before filtering, you can filter VSM's as soon as you have them in two-component VSM format. This opens the door to things like MSAA, mipmaps, and separable blur passes. This can not only give you better quality, but can also possibly make things cheaper relative to standard shadow maps (this is especially true if you cache your shadow maps across frames). Their other main advantage is that they are much easier to bias compared to standard shadow maps. With VSM it's possible to pick a single "magic" value that will work across a wide range of conditions, whereas with standard shadow maps you typically need artist-authored offsets combined with complex techniques that adjust the bias based on the sample position and receiver slope.

The main disadvantages are light bleeding (which I'm sure you're already familiar with), and the need to convert from "standard" depth into the variance format. Light bleeding can be reduced with a few tricks, but it will always be present to a certain extent for certain occluder/receiver configurations. The conversion requires either using a pixel shader when rendering to the shadow map, or having a conversion step after you've finished rendering to a depth buffer. The conversion can possibly be rolled into an MSAA resolve or the filtering step, if you use those things. You might see additional memory storage brought up as a concern for VSM, but in practice I've found that using an R16G16_UNORM format provides completely adequate precision, with the same footprint as a 32-bit depth buffer.
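For reference, the two-component format stores the first two moments of depth, and shading applies Chebyshev's inequality to get an upper bound on visibility. Here's a minimal sketch in Python (the minimum-variance clamp value is an illustrative assumption; real implementations tune it):

```python
def vsm_visibility(moments, receiver_depth, min_variance=1e-4):
    """Chebyshev upper bound on visibility from the two VSM moments.
    moments: (E[z], E[z^2]) -- because these are plain averages of depth
    and depth squared, they can be filtered (blurred, mipmapped, MSAA
    resolved) *before* this comparison, unlike a standard shadow map."""
    m1, m2 = moments
    if receiver_depth <= m1:
        return 1.0  # receiver is in front of the mean occluder depth
    variance = max(m2 - m1 * m1, min_variance)  # clamp to avoid numeric issues
    d = receiver_depth - m1
    return variance / (variance + d * d)  # Chebyshev: P(z >= receiver_depth)
```

The light bleeding mentioned above comes directly from this being an upper bound: when two occluders at different depths both contribute to the moments, the variance grows and the bound becomes loose, letting light through where it shouldn't.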
EVSM is much like VSM, except it attempts to address light bleeding by using an exponential warp. This warp can be very effective at reducing or eliminating light bleeding in most cases, but it won't fix all of them. Since it's an exponential warp you're pretty much required to use a floating-point format, which can quickly leave you with precision problems if you're not careful. 32-bit floats will give the best results, but you can get away with 16-bit if you're very careful about restricting your depth ranges and also use a more conservative warping factor. However, you really need to include the negative warp for best results, so at best you're looking at R16G16B16A16_FLOAT, which is double the footprint of a 32-bit depth buffer or a VSM texture. For maximum quality, you'll want 32-bit floating point, which means 4x the footprint of a standard shadow map.
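To show why four components are needed, here's a sketch of the warp in Python (the warp constants are illustrative assumptions; 16-bit formats force much smaller positive constants than 32-bit):

```python
import math

def evsm_warp(depth, c_pos=40.0, c_neg=5.0):
    """Warp a [0, 1] depth into positive and negative exponential spaces.
    Larger warp constants suppress more light bleeding, but exp(c_pos)
    grows huge -- which is why float formats (and ideally FP32) are
    effectively required."""
    pos = math.exp(c_pos * depth)
    neg = -math.exp(-c_neg * depth)  # negative warp: catches cases pos misses
    return pos, neg

def evsm_texel(depth, c_pos=40.0, c_neg=5.0):
    """The four values stored per texel (hence R16G16B16A16_FLOAT at
    minimum): each warped depth plus its square, i.e. VSM moments
    in each warped space."""
    pos, neg = evsm_warp(depth, c_pos, c_neg)
    return pos, pos * pos, neg, neg * neg
```

At shading time you apply the same Chebyshev bound as VSM independently in the positive and negative warped spaces (warping the receiver depth the same way) and take the minimum of the two results, which is what clamps most of the light bleeding.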
Having done a lot of work with EVSM myself, I will say that it's definitely a viable approach for higher-spec hardware that has good support for floating-point formats (for instance DX11 PC video cards, or current-gen consoles). If you're targeting more modest hardware, then you're probably better off sticking to standard shadow maps or VSM. I don't currently know of any shipping games or games in development that use EVSM, with the exception of the game that I'm working on (The Order: 1886). So when it comes out, you can have a look and judge the quality for yourself.