There are two common ways that Xcalar users represent matrices.
1. A matrix represented as a table with three columns:
i - the matrix row number
j - the matrix column number
value - the value in the i-th row and j-th column of the matrix
Xcalar's architecture parallelizes row and column calculations well when a matrix is represented this way, so you can build dataflows from Xcalar operations that execute matrix calculations efficiently.
2. A matrix represented as a single string, using characters such as "[", "]", ",", and ";" as delimiters. Xcalar's Python user-defined functions (UDFs) handle this representation easily, optionally with the help of Python libraries. The tradeoff is significant: you give up the speed of native Xcalar operations in exchange for the ease of working with strings in Python.
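To make the two representations concrete, here is a minimal sketch in plain Python. It is not Xcalar API code. The string layout ("," between values, ";" between rows, the whole matrix wrapped in "[" and "]"), the 1-based indices, and the function names parse_matrix_string, to_triples, and multiply_triples are all assumptions made for illustration. The multiplication is written as a join followed by a group-by and sum, which is the general shape a table-based calculation would take.

```python
from collections import defaultdict


def parse_matrix_string(text):
    """Parse a string such as "[1,2;3,4]" into a list of rows of floats.

    Assumed layout: "," separates values, ";" separates rows.
    """
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected the matrix to be wrapped in [ and ]")
    rows = [row.split(",") for row in body[1:-1].split(";")]
    matrix = [[float(cell) for cell in row] for row in rows]
    if len({len(row) for row in matrix}) != 1:
        raise ValueError("all rows must have the same number of columns")
    return matrix


def to_triples(matrix):
    """Flatten a list-of-rows matrix into (i, j, value) records (1-based)."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix, start=1)
        for j, value in enumerate(row, start=1)
    ]


def multiply_triples(a, b):
    """Multiply two matrices held as (i, j, value) records.

    Join A to B where A's column equals B's row, then group by
    (A's row, B's column) and sum the products.
    """
    # Index B by row number so the join is a dictionary lookup.
    b_by_row = defaultdict(list)
    for i, j, value in b:
        b_by_row[i].append((j, value))

    # Group by (row of A, column of B) and accumulate the products.
    sums = defaultdict(float)
    for i, k, a_value in a:
        for j, b_value in b_by_row.get(k, []):
            sums[(i, j)] += a_value * b_value
    return sorted((i, j, value) for (i, j), value in sums.items())


a = to_triples(parse_matrix_string("[1,2;3,4]"))
b = to_triples(parse_matrix_string("[5,6;7,8]"))
product = multiply_triples(a, b)
```

Each (i, j, value) record is independent of the others, which is why the three-column form lends itself to parallel execution, while the string form has to be parsed as a whole before any arithmetic can start.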
Here are some questions to help you decide between the two:
How big are your matrices? Will they commonly fit inside a string (16,000 characters)?
Are your matrices the entirety of the data you are working with?
The larger the matrix, and the more central matrix calculations are to your work, the more likely you will want to represent it as a full table and apply native Xcalar operations to it. The smaller the matrix, and the less significant a part it plays in your dataflow calculations, the more likely you will want to use a Python library.
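As a rough way to answer the size question, the sketch below serializes a matrix and compares its length against the 16,000-character figure mentioned above. The serialization layout and the names serialize_matrix and fits_in_string are assumptions for illustration, and the real length will depend on how your values are formatted.

```python
MAX_STRING_CHARS = 16000  # string size limit quoted above


def serialize_matrix(matrix):
    """Render a list-of-rows matrix as a string like "[1.0,2.0;3.0,4.0]"."""
    rows = (",".join(repr(float(value)) for value in row) for row in matrix)
    return "[" + ";".join(rows) + "]"


def fits_in_string(matrix, limit=MAX_STRING_CHARS):
    """Return True if the serialized matrix stays within the character limit."""
    return len(serialize_matrix(matrix)) <= limit


small = [[1, 2], [3, 4]]
large = [[1.5] * 100 for _ in range(100)]  # 100 x 100 matrix

small_fits = fits_in_string(small)
large_fits = fits_in_string(large)
```

Under these assumptions even a 100 x 100 matrix of short values exceeds the limit, so the string form is realistic only for fairly small matrices.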
The Xcalar field team will be releasing Dataflows that perform some common matrix calculations. I do not have an ETA yet, but I will be glad to post it when I do.
Do you have any other questions about this?