SCS on GPU in Julia
Splitting Conic Solver, or scs, is a well-established solver for conic optimization problems. It has bindings to Julia via SCS.jl.
Part of each iteration of the solver is solving a system of linear equations. scs provides some freedom in this respect, i.e. one can choose one of the following provided methods:
- Direct Solver using QDLDL and AMD
- Indirect Solver using conjugate gradient
- Indirect Solver on GPU using sparse conjugate gradient through CUDA
(You may also implement your own solver.)
On the Julia side only the Direct and Indirect solvers are available (since #95).
The main problem with providing the GPU solver is that cuSPARSE works only with C ints, so scs has to be compiled with the DLONG=0 option, while other software (including Julia) requires DLONG=1.
This means that when the user asks for the GPU solver we need not only to cast our data appropriately, but also to alter the objects we're passing to e.g. scs_solve, by changing the storage type, struct alignment etc. In Julia we have a simple method for making this happen: parametrize the mirrored C structs by the type of integer. Long story short: by checking out the enh/gpu_solver branch you can enjoy the GPU solver even from Julia.
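The parametrization idea can be sketched roughly as follows. This is only an illustration: `MatrixMirror`, `integer_type` and `cast_indices` are hypothetical names made up for this example and are not the definitions used in SCS.jl. The sketch also assumes that `scs_int` is a 64-bit integer under DLONG=1 and a 32-bit one under DLONG=0.

```julia
# Illustrative mirror of a C sparse-matrix struct (compressed sparse column).
# `MatrixMirror` and its field names are hypothetical, not SCS.jl's definitions.
struct MatrixMirror{T<:Union{Int32,Int64}}
    values::Ptr{Cdouble}  # nonzero values
    rowval::Ptr{T}        # row indices, stored as the library's integer type
    colptr::Ptr{T}        # column pointers
    m::T                  # number of rows
    n::T                  # number of columns
end

# Hypothetical helper: the integer type a given library is assumed to use
# (32-bit for the GPU build with DLONG=0, 64-bit otherwise).
integer_type(solver::Symbol) = solver === :gpu ? Int32 : Int64

# Cast index data to the integer type the chosen library expects.
# `convert` throws an InexactError if a value does not fit in the target type.
cast_indices(::Type{T}, idx::Vector{<:Integer}) where {T} = convert(Vector{T}, idx)

rowval = cast_indices(integer_type(:gpu), [0, 2, 1])
@show eltype(rowval)

# The same definition gives two different memory layouts.
@show sizeof(MatrixMirror{Int32}) sizeof(MatrixMirror{Int64})
```

With one parametric definition, each library gets a struct whose field sizes and alignment follow from the integer type, so nothing has to be duplicated by hand.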
Well, not quite: you need to compile scs first. (This used to start with a tiny patch to scs.mk, but a patch for using CUDA_PATH has already been merged, so there is no need for it anymore.)
Then define the environment variables:
- CUDA_PATH pointing to the CUDA installation
- JULIA_LIBRARY_PATH pointing to ./lib/julia inside your Julia installation
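If you are unsure where ./lib/julia lives, Julia itself can probably tell you. The snippet below is a small sketch that assumes a standard installation layout, with lib/julia sitting next to the bin directory; the variable name `julia_libdir` is just for illustration.

```julia
# Assumption: a standard Julia layout, with the bundled libraries
# (OpenBLAS among them) in lib/julia next to the bin directory.
julia_libdir = normpath(joinpath(Sys.BINDIR, "..", "lib", "julia"))

# Print a line that can be pasted into the shell before running make.
println("export JULIA_LIBRARY_PATH=", julia_libdir)

# Warn instead of failing if the layout assumption does not hold.
isdir(julia_libdir) || @warn "directory not found; adjust the path by hand" julia_libdir
```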
Finally compile scs with
make purge
make -j4 CFLAGS="-march=native" DLONG=1 USE_OPENMP=1 BLASLDFLAGS="-L$JULIA_LIBRARY_PATH -lopenblas64_" BLAS64=1 BLASSUFFIX=_64_
LD_LIBRARY_PATH=$JULIA_LIBRARY_PATH ./out/demo_socp_direct 1000 0.5 0.5 1
LD_LIBRARY_PATH=$JULIA_LIBRARY_PATH ./out/demo_socp_indirect 1000 0.5 0.5 1
make clean
# note DLONG=0 below!
make -j4 CFLAGS="-march=native" DLONG=0 USE_OPENMP=1 BLASLDFLAGS="-L$JULIA_LIBRARY_PATH -lopenblas64_" BLAS64=1 BLASSUFFIX=_64_ gpu
LD_LIBRARY_PATH=$JULIA_LIBRARY_PATH ./out/demo_socp_gpu 1000 0.5 0.5 1
make clean
We run make purge at the beginning and make clean in the middle to get rid of partial products (as these were produced with DLONG=1).
In addition, we run the demos to make sure scs produced working libraries.
At this moment there should be three libraries in ./out: libscsdir, libscsindir and libscsgpuindir. (The first two are compiled with scs_int = long implied by DLONG=1, the last with scs_int = int, assuming you're on a sane platform.) Finally, we need to set ENV["JULIA_SCS_LIBRARY_PATH"]="/path/to/source/of/scs/out" and issue Pkg.build("SCS") (don't forget to dev SCS first).
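Before building, it may be worth checking from Julia that the three libraries are really where the build will look for them. The following is a minimal sketch; `missing_libs` is a hypothetical helper written for this post, not part of SCS.jl, and it assumes the files carry the platform's usual shared-library extension.

```julia
using Libdl  # provides dlext, the platform's shared-library extension

# Library base names as listed in the text.
const EXPECTED_LIBS = ["libscsdir", "libscsindir", "libscsgpuindir"]

# Hypothetical helper: return the expected libraries that are absent from `dir`.
function missing_libs(dir::AbstractString)
    present = isdir(dir) ? readdir(dir) : String[]
    return [lib for lib in EXPECTED_LIBS if !("$lib.$(Libdl.dlext)" in present)]
end

# Look in the directory the build is told to use (empty if the variable is unset).
outdir = get(ENV, "JULIA_SCS_LIBRARY_PATH", "")
absent = missing_libs(outdir)
if isempty(absent)
    println("all three libraries found in ", outdir)
else
    println("missing from '", outdir, "': ", join(absent, ", "))
end
```

If the list is not empty, rerun the corresponding make step before issuing Pkg.build("SCS").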
If you want to be sure, check SCS.available_solvers and observe the output of Pkg.test("SCS")!
