exploration of a bunch of different data
If the size or nature of the data fluctuates substantially, the time it takes to process it can change as well. For instance, Walmart store sales on a weekend may be 50% higher than on a weekday, so not all datasets are the same. Additionally, you need to check whether there is any skew in the data that translates into an imbalance in the way the processing is distributed across the cluster. Skew deserves a separate post of its own.
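As a rough illustration of what "skew" means here, the sketch below compares the busiest node against the average. It assumes you can obtain a row count per node from your system; the node names, the counts, and the `skew_ratio` helper are all hypothetical and not part of any product API.

```python
from statistics import mean


def skew_ratio(rows_per_node):
    """Return the largest per-node row count divided by the mean row count.

    A value close to 1.0 suggests the rows are spread evenly; a much larger
    value suggests one node is holding most of the data.
    """
    counts = list(rows_per_node.values())
    avg = mean(counts)
    if avg == 0:
        return 1.0
    return max(counts) / avg


# Made-up row counts for a four-node cluster (illustrative only).
balanced = {"node0": 1000, "node1": 1010, "node2": 990, "node3": 1000}
skewed = {"node0": 3400, "node1": 200, "node2": 200, "node3": 200}

print(f"balanced: {skew_ratio(balanced):.2f}")
print(f"skewed:   {skew_ratio(skewed):.2f}")
```

In the skewed case one node holds most of the rows, so the other nodes are likely to sit idle while that node finishes its share.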
Can you send a sample dataset where you see this difference?
I use MAP a lot, and even that seems to perform differently at times.
It's not that an operation is inherently fast or slow; it depends on what the operation does. For instance, a Map that converts a data type can be incredibly fast, but a Map that executes a user-defined function (UDF) may run slowly, depending on how the UDF is written.
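To make the difference concrete, here is a small plain-Python sketch, not tied to any particular cluster product. It applies two functions to the same values: a simple type conversion, and a hypothetical UDF that rebuilds a lookup table on every call. Both function names and the workload are invented for illustration.

```python
import time


def cast_to_int(value):
    """Cheap map function: a plain type conversion."""
    return int(value)


def slow_udf(value):
    """Hypothetical UDF with a common mistake: it rebuilds a lookup
    table on every call instead of building it once up front."""
    lookup = {str(i): i for i in range(1000)}
    return lookup.get(value, int(value))


def timed_map(func, values):
    """Apply func to every value and return (results, elapsed seconds)."""
    start = time.perf_counter()
    result = [func(v) for v in values]
    return result, time.perf_counter() - start


values = [str(i % 1000) for i in range(2000)]
fast_result, fast_seconds = timed_map(cast_to_int, values)
slow_result, slow_seconds = timed_map(slow_udf, values)

# Both produce the same output; only the cost per row differs.
print(f"cast: {fast_seconds:.4f}s, UDF: {slow_seconds:.4f}s")
```

The two maps return identical results, so the operation looks the same from the outside. The per-row work inside the function is what decides the running time.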
Please send further details on what kind of Map operations you are performing.
I've seen some dials and things which tell me about the system, but I don't really understand them and sometimes I don't see much activity in the CPU.
A couple of tips here, but this is a bigger topic and may require further research on your part. If you are watching CPU, let me know whether you are seeing uniformly low CPU on all the nodes or only on some nodes of the cluster. Also let me know whether this happens during Import of a dataset or during subsequent processing.
How can I figure out what's happening or why it's fast sometimes and slow at other times?
Look at the answer to the first question. The volume of data you operate on can be one factor. You may also want to look at background usage: how many other users are on the system, and what are they doing while you are using it? Are there other things running on your cluster? Check the network dials to see if there is any irregular or excessive network activity. Also look at memory usage to see whether the system is swapping. Finally, you can take a look at I/O. For this you can use a tool like iotop to see whether there is a lot of I/O contention.
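For the swapping check, one possible starting point on a Linux node is the `SwapTotal` and `SwapFree` fields of `/proc/meminfo`. The sketch below parses text in that format; the helper names are hypothetical and the sample values are made up, not real measurements from any cluster.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(text):
    """Return swap in use (kB), computed as SwapTotal minus SwapFree."""
    fields = parse_meminfo(text)
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)


# Illustrative sample in /proc/meminfo format (values are invented).
sample = """MemTotal:       16384000 kB
MemFree:          512000 kB
SwapTotal:       2097152 kB
SwapFree:        1048576 kB"""

print(swap_used_kb(sample), "kB of swap in use")

# On a real Linux node you could pass the live file contents instead:
# with open("/proc/meminfo") as f:
#     print(swap_used_kb(f.read()))
```

Swap being in use does not by itself prove the node is actively swapping right now. It is only a hint that memory has been under pressure; the `si` and `so` columns of `vmstat` show swap activity as it happens.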