The main purpose of the dataflow graph is to show the history of relational operations (filter, aggregate, sort, join) and the virtual tables instantiated along the way to your result datasets. The primary interface for working with your data is the worksheet, and you would not normally create a new table by itself. Instead, most operations, including joins, create a new virtual table to hold their results. You can still work with any virtual table in the dataflow graph: you can add it back to the worksheet, revert the data, drop it, lock it, and more.
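To make the idea concrete, here is a minimal sketch in plain Python of a graph in which every operation yields a new virtual table that remembers its inputs. It is not Xcalar code: the names `VirtualTable`, `filter_rows`, `join`, and `history` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualTable:
    """One node in the graph: a table plus the operation that produced it."""
    name: str
    rows: list
    operation: str = "source"
    parents: list = field(default_factory=list)


def filter_rows(table, predicate, name):
    """Filtering creates a new virtual table; the input table is unchanged."""
    rows = [row for row in table.rows if predicate(row)]
    return VirtualTable(name, rows, "filter", [table])


def join(left, right, key, name):
    """Inner join on a shared key; the result records both inputs as parents."""
    index = {}
    for rrow in right.rows:
        index.setdefault(rrow[key], []).append(rrow)
    rows = [
        {**lrow, **rrow}
        for lrow in left.rows
        for rrow in index.get(lrow[key], [])
    ]
    return VirtualTable(name, rows, "join", [left, right])


def history(table):
    """Walk back from a result table and list the operations that led to it."""
    steps = []
    seen = set()

    def visit(node):
        if node.name in seen:
            return
        seen.add(node.name)
        for parent in node.parents:
            visit(parent)
        steps.append(f"{node.operation} -> {node.name}")

    visit(table)
    return steps


orders = VirtualTable("orders", [
    {"customer_id": 1, "total": 40},
    {"customer_id": 2, "total": 250},
])
customers = VirtualTable("customers", [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Lin"},
])

large = filter_rows(orders, lambda row: row["total"] > 100, "large_orders")
result = join(large, customers, "customer_id", "large_orders_named")

for step in history(result):
    print(step)
```

Because each intermediate table such as `large_orders` stays reachable from the result, it can be inspected or reused later, which mirrors the way any virtual table in the graph remains available to work with.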
The other purpose of the dataflow graph is to describe the algorithm or model you have defined through the relational operations performed on your data, and to save it as a batch dataflow that can be run repeatedly on new data coming through your system. Typically, this is done on a separate production cluster rather than on the modelling cluster most people use to design dataflows. Batch dataflows can be run on large production datasets on a schedule set in Xcalar Design or on demand via the Xcalar SDK REST API.
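The "define once, run repeatedly" idea can be sketched in plain Python as a list of saved steps replayed against each new batch of rows. This is only an illustration under stated assumptions: the JSON layout, the `OPERATIONS` registry, and `run_batch` are hypothetical, and none of them reflects the actual batch dataflow format or the Xcalar SDK REST API.

```python
import json


# Hypothetical operation builders: each turns JSON-friendly parameters
# into a function that maps a list of rows to a new list of rows.
def make_filter(column, minimum):
    return lambda rows: [row for row in rows if row[column] >= minimum]


def make_sum(column):
    return lambda rows: [{"sum_" + column: sum(row[column] for row in rows)}]


OPERATIONS = {"filter": make_filter, "sum": make_sum}


def run_batch(dataflow_json, rows):
    """Replay the saved steps, in order, on a fresh dataset."""
    for step in json.loads(dataflow_json):
        transform = OPERATIONS[step["op"]](**step["args"])
        rows = transform(rows)
    return rows


# The saved dataflow is plain data, so it can be stored and reused later.
saved = json.dumps([
    {"op": "filter", "args": {"column": "total", "minimum": 100}},
    {"op": "sum", "args": {"column": "total"}},
])

monday = [{"total": 40}, {"total": 250}, {"total": 120}]
tuesday = [{"total": 500}, {"total": 10}]

print(run_batch(saved, monday))   # [{'sum_total': 370}]
print(run_batch(saved, tuesday))  # [{'sum_total': 500}]
```

The same saved definition produces a result for each day's data without being redesigned, which is the property that lets a dataflow built interactively be run later on new data.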