
I need to decide on my database architecture. The app offers files for download. I could add a counter to every file and track its total downloads. But the client may want some analysis, such as the most popular content over the last 30 days, and a single running counter cannot answer that.

Alternatively, I can create a Downloads table with date and filename fields. We could then query downloads for any period of time and any file. The disadvantage is that the amount of data in the table could be HUGE. Is there a way to improve this solution?

1 Answer


Your second design is good. Keep one timestamp + filename row per download, forever. Index at least the timestamp column.

Is the table “too big”? No. Dashboard queries for “today” or even for this week’s downloads read only the recent rows, ignore the bulk of the table, and complete quickly thanks to the index.
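To make the raw-table design concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `downloads`, the column `ts`, the index name `idx_downloads_ts`, and both helper functions are my own illustrative choices, not anything from your app. It assumes timestamps are stored as ISO-8601 text, which sorts chronologically, so a range predicate on `ts` is something the index can serve.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the raw downloads table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE downloads (ts TEXT NOT NULL, filename TEXT NOT NULL)"
    )
    # The index on the timestamp is what lets "recent" queries skip old rows.
    conn.execute("CREATE INDEX idx_downloads_ts ON downloads (ts)")
    return conn


def top_files_since(conn, since, limit=10):
    """Return (filename, count) pairs for downloads at or after `since`, most popular first."""
    return conn.execute(
        "SELECT filename, COUNT(*) AS n FROM downloads"
        " WHERE ts >= ?"
        " GROUP BY filename"
        " ORDER BY n DESC, filename"
        " LIMIT ?",
        (since, limit),
    ).fetchall()
```

For the “last 30 days” question you would pass a `since` value 30 days before now. Other database engines will differ in syntax details, but the shape of the query should carry over.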

Create a separate reporting table that stores summary counts. A midnight cron job updates the daily, weekly, and quarterly summary counts. Dashboard queries for historic charting can then consult that small summary table instead of the raw rows. Populate it with a GROUP BY. Consider using HAVING COUNT(*) > … so you ignore the onesie-twosie unpopular filenames. If desired, pick those up in a separate query that aggregates them as “misc”.
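The nightly rollup could look roughly like the sketch below, again with sqlite3. Everything named here is an assumption for illustration: the `daily_summary` table, the `(misc)` sentinel filename, the `rollup_day` function, and the `threshold` parameter, whose value you would have to pick yourself. It expects a `downloads(ts, filename)` table with ISO-8601 text timestamps to already exist.

```python
import sqlite3
from datetime import date, timedelta

MISC = "(misc)"  # hypothetical sentinel row name for the unpopular long tail


def rollup_day(conn, day, threshold):
    """Rebuild the daily_summary rows for one day given as 'YYYY-MM-DD'."""
    next_day = (date.fromisoformat(day) + timedelta(days=1)).isoformat()
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS daily_summary ("
            " day TEXT NOT NULL, filename TEXT NOT NULL,"
            " downloads INTEGER NOT NULL, PRIMARY KEY (day, filename))"
        )
        # Clear any earlier result so re-running the job is harmless.
        conn.execute("DELETE FROM daily_summary WHERE day = ?", (day,))
        # Popular files: one row each, selected with GROUP BY ... HAVING.
        conn.execute(
            "INSERT INTO daily_summary (day, filename, downloads)"
            " SELECT ?, filename, COUNT(*) FROM downloads"
            " WHERE ts >= ? AND ts < ?"
            " GROUP BY filename HAVING COUNT(*) > ?",
            (day, day, next_day, threshold),
        )
        # Everything at or below the threshold is summed into one row.
        (tail,) = conn.execute(
            "SELECT COALESCE(SUM(n), 0) FROM ("
            " SELECT COUNT(*) AS n FROM downloads"
            " WHERE ts >= ? AND ts < ?"
            " GROUP BY filename HAVING COUNT(*) <= ?)",
            (day, next_day, threshold),
        ).fetchone()
        if tail:
            conn.execute(
                "INSERT INTO daily_summary VALUES (?, ?, ?)", (day, MISC, tail)
            )
```

Weekly and quarterly rollups would follow the same pattern with a wider date range. Deleting and rebuilding the day's rows is one way to keep the job safe to re-run; an upsert would be another.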

You are free to store monthly figures as well. I caution you that you then need to worry about things like how many days are in that month, and also account for calendar effects like how many weekend days fall in that month.
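If you do keep monthly figures, one possible way to make months comparable is to divide by the number of days, and to record the weekend-day count alongside. This is only a sketch with the standard library; `month_shape` and `per_day_average` are hypothetical helper names, and treating Saturday and Sunday as the weekend is an assumption that may not match your audience.

```python
import calendar
from datetime import date


def month_shape(year, month):
    """Return (days in month, weekend days in month), counting Sat/Sun as weekend."""
    days = calendar.monthrange(year, month)[1]
    weekend = sum(
        1
        for d in range(1, days + 1)
        if date(year, month, d).weekday() >= 5  # 5 = Saturday, 6 = Sunday
    )
    return days, weekend


def per_day_average(total_downloads, year, month):
    """Average downloads per calendar day, so 28-day and 31-day months compare fairly."""
    days, _ = month_shape(year, month)
    return total_downloads / days
```

For example, February 2024 has 29 days with 8 weekend days, while March 2024 has 31 days with 10, so equal raw totals for those two months do not mean equal popularity.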
