Hi Aakash,
The dashboard application has a tendency to keep a lot of historical data. The groomer usually cleans most of this up, but you might want to get a rough distribution of how much data each app is using. You can approximate this with the following CQL query:
SELECT COUNT(*) FROM "ENTITIES__"
WHERE token(key) > token(textAsBlob('appscaledashboard'))
AND token(key) < token(textAsBlob('appscaledashboard\x01'));
The above query will show the number of entity rows that the dashboard application is using. You can do the same for your other apps to see which is using the most data.
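If you have several apps to check, a small script can generate the same query for each one. This is only a rough sketch: the function name and the "guestbook" app ID are made up for illustration, and it simply reproduces the query text above with a different app ID substituted in. You would still paste the output into cqlsh yourself.

```python
def app_row_count_query(app_id, table="ENTITIES__"):
    """Build the CQL count query shown above for a single app ID.

    The helper name and default table are illustrative assumptions;
    the query text itself mirrors the one in this message.
    """
    # Refuse characters that would break out of the CQL quoting.
    if "'" in app_id or '"' in table:
        raise ValueError("app_id/table must not contain quote characters")
    return (
        'SELECT COUNT(*) FROM "{table}"\n'
        "WHERE token(key) > token(textAsBlob('{app}'))\n"
        "AND token(key) < token(textAsBlob('{app}\\x01'));"
    ).format(table=table, app=app_id)


if __name__ == "__main__":
    # 'guestbook' is a hypothetical app ID; replace with your own.
    for app in ("appscaledashboard", "guestbook"):
        print(app_row_count_query(app))
        print()
```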
If you have any "last_modified" timestamp fields on your entities, that would be the easiest way to delete older entities. Your application can query and delete them as needed.
Another thing you may consider is a Cassandra compaction if it has not been run recently. When data is deleted from Cassandra, it still exists on disk until a compaction is run. Be aware that your deployment may suffer performance issues while the compaction is running.
Which version of AppScale are you using? I ask because 3.0 drops the journal table, which typically takes up a very large portion of the disk space used by an AppScale deployment. You may not be ready to upgrade yet, particularly because it requires a conversion process that may take a while on a 250GB cluster, but it's something to keep in mind.
Also, how many nodes do you have in your Cassandra cluster, and what is the replication factor?
To answer your question about a setting for auto-deleting application data: AppScale does not have that feature built in (other than what the groomer does for the dashboard data). Cassandra does store metadata about when data was written (which can be accessed with the writetime CQL function), so it's theoretically possible to achieve what you are asking for, but it would require a bit of work. You'd need to delete the corresponding entries in the index tables along with the entity data.
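To give an idea of the writetime approach, here is a minimal sketch of the age check only. It assumes you have already fetched (key, writetime) pairs yourself. The function names are hypothetical, and nothing here talks to Cassandra or touches the index tables. The one firm detail is that writetime returns microseconds since the Unix epoch, so the cutoff has to be expressed in the same unit.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_cutoff(max_age_days, now=None):
    """Return the cutoff as microseconds since the epoch.

    This is the unit the CQL writetime() function reports.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    # Integer division by a 1-microsecond timedelta avoids float rounding.
    return (cutoff - EPOCH) // timedelta(microseconds=1)


def stale_keys(rows, cutoff_micros):
    """Pick keys whose write time is strictly older than the cutoff.

    `rows` is an iterable of (key, writetime_micros) pairs that you have
    already fetched; rows with no write time are skipped.
    """
    return [key for key, written in rows
            if written is not None and written < cutoff_micros]
```

The keys returned would only be candidates for deletion; the matching index entries would still need to be found and removed separately.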
-Chris