2019-06-07 v2.4.1
    Minor updates:
    * fix setnumthreads in pDet
    * add README.md in distribution
    * support for Hygon Dhyana

2019-05-10 v2.4.0
    New features:
    * fsytrf: a symmetric triangular factorization, revealing the rank profile matrix (RPM)
    * fsyrk, fsyr2k, ftrssyr2k, ftrstr: subroutines for symmetric operations
    * support for AVX512 vectorization
    * parallelization of fgemm-rns, fsytrf, echelon forms, det, rank, etc
    * API for parallel routines outside of par-block (e.g. for SageMath)
    Improvements:
    * more examples
    * more benchmarks: fgesv
    * many bug fixes
    * improved testsuite
    * update to Givaro's revamped modular fields
    * improved igemm
    * improved test coverage for SIMD
    * improved charpoly
    * improved freduce, which consequently speeds up most routines

2017-12-21 v2.3.2
    Improvements:
    * minor bug fixes in the build system and with GF2
    * new specialization for fgemv over recint

2017-11-22 v2.3.1
    Improvements:
    * minor bug fixes in the build system
    * improved cblas/fblas detection and use

2017-11-17 v2.3.0
    Improvements:
    * improved build system (instruction set detection, C++11 and clang compatibility, ...)
    * improved ftrtri (triangular matrix inverse)
    * increased test-suite coverage
    * more autotuning
    * clean-up and update all random matrix generators so they can be seeded
    * clean-up the test-suite and enable seeding parameter
    * many bug fixes (and merging sage patches)
    New features:
    * new pfgemv routine (parallel matrix vector product)
    * new fpotrf routine (Cholesky factorization) and symmetric rand generator
    * new tutorials
    * Gauss-Jordan inverse made to work
    Changes in API:
    * change signature for CharPoly (now takes a polynomial domain as input)
    * change the signature of ftrtrm

2016-07-30 v2.2.2
    * many bug fixes ensuring consistent support of clang, gcc-4.8 5.3 6.1, icpc on i386, x86_64, ubuntu and fedora, ppcle and osx
    * new SIMD detection
    * use pkgconfig
    * new feature: checkers for Freivalds-based verification
    * improved performance of permutation application

2016-04-08 v2.2.1
    * many fixes to the build system
    * more consistent use of flags and dependency on precompiled code
    * fixes all remaining issues for the integration in SageMath
    * numerous minor fixes to the parallel code

2016-02-23 v2.2.0
    * new precompiled interface
    * improvements and API change for the parallel code
    * new random matrix generators
    * fix many bugs

2015-06-11 v2.1.0
    Test suite and benchmark improvement :
    * much larger coverage
    * run most tests over a wide range of fields
    * systematic interface and options
    New features:
    * parallel PLUQ
    * computation of rank profiles and rank profile matrices
    * echelon and reduced echelon forms from both LUdivine and PLUQ
    * getters to the forms and the transformation matrices
    * igemm routine for BLAS-like gemm on 64-bit ints
    * support of Modular and ModularBalanced using igemm, to support fields of bitsize between 25 and 31
    * support of Modular for Z/pZ with p of size > 32 bits (based on Givaro's RecInt multiprecision integers)
    * support of RNS based gaussian elimination on multiprecision fields
    * Paladin: DSL for parallel programming addressing OMP, TBB and Kaapi
    Improvements:
    * a lot of new sparse mat-vec product improvements
    * faster parallel and sequential fgemm
    * many bugs found and removed (no known bugs at release time)
    * improved helper system, with mode of operations

2014-08-08 v2.0.0
    code update :
    * rank profile
    * clean namespaces
    * use field one, zero, etc
    * fix clang warnings
    * more blas wrappers (sger, sdot, copy, etc)
    * simplification of fgemm
    * simplify blas detection (+cflags)
    * easier permutation handling
    * improve testers
    * use std::min, max
    * many functions have API change to use last pointer argument for return
    * some more doc
    * and probably many more in 2+ years !
    bugs :
    * correct permutations
    * fix fgemm, fgemv, ftrmm, ftrsm bugs
    * mem leaks
    * bugs for degenerate cases
    * fix bounds
    * and probably many more in 2+ years !
    new features :
    * new pluq 2x2 recursive alg
    * leftlooking
    * parallel OMP fgemm, ftrmm, ftrsm
    * parallel KAAPI fgemm, ftrmm, ftrsm
    * new testers for pluq, fgemm, etc
    * new tester for Bini approximate formula
    * fadd, fsub, finit, fscal, etc
    * vectorisation using AVX(2)
    * in place schedules
    * new Echelon code
    * helper design for fgemm, fgemv, etc
    * template factorisation for modular/multiprecision fields
    * helper traits
    * automatic matrix field conversion (ie double -> float)
    * add spmv kernels
    * enable use of sparse MKL
    * parallel.h, avx and simd files
    * new DSL for parallelism
    * RNS and multiprecision fields
    * new const_cast, fflas_new etc functions
    * element_ptr in fields
    * use Givaro dependency (compulsory now)
    * new test for regressions (with tickets)
    * and probably many more in 2+ years !

2011-04-15 v1.4.0
    * Convert project to autotools (à la LinBox and Givaro)

2008-06-05 v1.3.3
    * fix the design of specializations to modular modular
    * give a proper name to ModularBalanced
    * fix the bugs in the bound computations (Winograd recursion over the finite field was too deep)
    * prepare the interface for integrating compressed representation for small finite fields

2007-09-28 v1.3.2
    * add routines fgetrs and fgesv (cf LAPACK), for system solving. Supports rectangular, over/underdetermined systems.

2007-08-29 v1.3.1
    * add the benchmark directory, for automatic benchmarking against GOTO and ATLAS BLAS. Adapted from Pascal Giorgi's benchmark system.

2007-08-28 v1.3.0
    * new version of ftrmm ftrsm (ftrsm based on a multicascade algorithm reducing the number of modular reductions). Automated generation of each of the 48 specializations
    * several bug fixes
    * add regression tests: testeur_fgemm, testeur_lqup and testeur_ftrsm

2007-07-05 v1.2.2
    * add a transposed version of the LQUP decomposition routine LUdivine
    * fix many bugs in LUdivine
    * new schedules for Winograd algorithm for matrix multiplication: 2 cases depending on whether beta = 0 or not, taken from [Huss Ledermann & Al. 96]
    * add rowEchelon and ReducedRowEchelon routines + associated tests

2007-06-21 v1.2.1
    * add the use of float BLAS, if the field cardinality is small enough
    * improve genericity: gemm can be used over any field domain (not requiring any conversion to an integral representation)
    * add a variant of Winograd's algorithm with fewer temporaries for the operation C = AxB
    * add ColumnEchelon and ReducedColumnEchelon routines, using an inplace algorithm, based on the LQUP decomposition of LUdivine
    * add routines ftrtri (replacing invL), ftrtrm.
    * fix a bunch of memory leaks in the tests (not yet finished)

2007-03-13 v1.1.2
    * change the genericity system for trsm to detect Field implementations over double (compatibility with LinBox)

2007-03-11 v1.1.1
    * complete preconditioning phase for the new Charpoly algorithm
    * new Charpoly algorithm renamed CharpolyArithProg
    * add exception for failure of the LasVegas algorithm
    * default charpoly is now: 2 attempts to CharpolyArithProg, then LUKrylov

2007-02-27 v1.1.0
    * change some naming conventions in the directories
    * add a LQUP routine for small dimension (LUdivine_small) and the cascading with LUdivine
    * put the bound computations in the same file
    * add dense_generator.C for the generation of random dense matrices in tests
    * add the new algorithm for characteristic polynomial (temporarily named frobenius)

2006-08-11 v1.0.1
    * add the field implementation modular-positive.h, especially for p=2
    * add the flag 'balanced' to the finite fields modular, to switch to the appropriate bound computation (fgemm and trsm)
    * fix a bug in LUdivine LQUP elimination (initialisation of the permutation P for N=1 in the terminal case)
    * fix a bug in the determination of the number of recursive levels of Winograd Algorithm.