Theoretically, in the most fantastical of worlds, you could build software to handle a dataset you don't have. That's just hypothetical, though. That's not reality. That's not what happens in software development. If you want to develop something quickly, profitably, and on time, and to please your clients, chances are you're going to have access to a good deal of the actual raw data.
The claims that AggregateIQ didn't have access to raw datasets, I believe, are disproven simply by what's present in the GitLab files, because there are user names, passwords, and network locations for what are labelled as actual databases, not fake databases.
I believe it's highly unlikely that AggregateIQ didn't have access to very large raw datasets.