The mathematics is completely off on both counts, as far more variables would be applied in practice.
"Data Mining" has been about for a number of years, most commonly you'll see it in day-to-day operations through the use of a search engine on the internet. The way a search engine works for the most part is first it has to collect a "view" of a website through the use of a "Robot" program. This information is then housed in a server as a cache which then has an "Agent" program iterate through line by line to compile information on what that cache content contains. Depending on how that Agent is written defines what information is looked for (through "filters") and how that information is that stored in a "fast indexing" format within a database. (The cache's can then be purged, although in recent years companies have a tendency to keep a cache to show what's available if something is down and also likely to show what content was there in the cases of litigation)
A "spider" is a program that attempts to follow all the links within the cached content and identify other pages for the "Robots" to be sent to, it doesn't deal with the pages content just the site structuring and source locations.
Then we get to the "Search Engine" itself, which attempts to match the "filters" the user provides (words separated by spaces, prefixed with -/+, or surrounded by quotation marks for exact phrases). The fast-indexing database is searched at that point to find the link to the cached page and, of course, the URL of the actual page.
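A toy matcher for that step, assuming the inverted index from the earlier sketch plus a dict of cached text, could look like this. Plain and "+" terms must appear, "-" terms must not, and quoted phrases are checked against the cached text of the surviving pages. This is a sketch of the general idea, not any real engine's query logic.

```python
# Toy query matcher: assumes "index" is the inverted index built earlier and
# "cache_text" maps page_id -> cached page text. Entirely illustrative.
import re

def search(query, index, cache_text):
    phrases = re.findall(r'"([^"]+)"', query)
    terms = re.sub(r'"[^"]+"', '', query).split()
    required = [t.lstrip('+') for t in terms if not t.startswith('-')]
    excluded = [t.lstrip('-') for t in terms if t.startswith('-')]

    results = None
    for term in required:                      # every required word must match
        pages = index.get(term.lower(), set())
        results = pages if results is None else results & pages
    results = results or set()
    for term in excluded:                      # drop pages with excluded words
        results -= index.get(term.lower(), set())
    # Quoted phrases fall back to scanning the cached text of surviving pages.
    return {p for p in results
            if all(ph.lower() in cache_text[p].lower() for ph in phrases)}
```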
Additional algorithms can then be applied to find "similarities", such as the number of times the filter words appear across all the site listings the agents have been through.
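For instance, a very simple relevance pass might just count how often the filter words occur in each matched page's cached text and rank by that raw frequency. Real engines use far more signals; this only illustrates the counting idea, and the function name and sample data are invented.

```python
# Sketch of ranking matched pages by how often the filter words appear.
import re
from collections import Counter

def rank_by_term_frequency(filter_words, pages):
    """pages: dict of page_id -> cached text. Returns (page_id, score) pairs."""
    wanted = {w.lower() for w in filter_words}
    scores = Counter()
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word in wanted:
                scores[page_id] += 1
    return scores.most_common()

print(rank_by_term_frequency(["data", "mining"],
                             {"a": "data mining and more data",
                              "b": "mining history"}))
# [('a', 3), ('b', 1)]
```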
The same can be said about any "Data Mining" operation: it's split into procedural components to make it easy to manage, and no algorithm alone would be capable of producing "matches" without enough data being presented. (Imagine you want an algorithm to look for things out of place; how can you do so if you haven't got enough information to identify what "normal" looks like? That is why such a large number of data records are collected and used by the various governments.)
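To make the "not enough data" point concrete, here is a toy outlier check: it flags values far from the mean, and it simply refuses to run until it has enough samples to estimate a baseline. The threshold, the minimum sample size and the traffic numbers are all made up for the example.

```python
# Toy "out of place" detector: needs enough records to define "normal" first.
import statistics

def find_outliers(values, threshold=3.0):
    if len(values) < 30:                  # too little data to establish a baseline
        raise ValueError("not enough records to establish a baseline")
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [v for v in values if stdev and abs(v - mean) / stdev > threshold]

normal_traffic = [100 + (i % 7) for i in range(100)]   # fabricated baseline
print(find_outliers(normal_traffic + [450]))           # [450]
```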
The concern people have isn't the fact that they are being watched by an automated observer following predefined algorithmic patterns; the concern is who makes those patterns and why. After all, it might be right in the current climate to look for a pattern match involving a particular archetype of person (i.e. a would-be terrorist); however, if the powers that be wanted to hunt people down at a eugenics level (singling out the poor, the disabled or those with socio-dysfunctional family ties), it would be easy to change it to operate like that. (Which is reminiscent of what the Nazi movement was all about.)
"Data Mining" has been about for a number of years, most commonly you'll see it in day-to-day operations through the use of a search engine on the internet. The way a search engine works for the most part is first it has to collect a "view" of a website through the use of a "Robot" program. This information is then housed in a server as a cache which then has an "Agent" program iterate through line by line to compile information on what that cache content contains. Depending on how that Agent is written defines what information is looked for (through "filters") and how that information is that stored in a "fast indexing" format within a database. (The cache's can then be purged, although in recent years companies have a tendency to keep a cache to show what's available if something is down and also likely to show what content was there in the cases of litigation)
A "spider" is a program that attempts to follow all the links within the cached content and identify other pages for the "Robots" to be sent to, it doesn't deal with the pages content just the site structuring and source locations.
Then we actually get to the part about the actual "Search Engine", which itself attempts to pair "filters" (as provided by words separated by spaces, prefixed with -/+ or surrounded by comments) The fast indexing database is searched at that time to find the link to the cached page and of course URL of the actual page.
Additional algorithms can then be applied to find "similarities" in the number of instances of those particular words used in the filter through all the sites listings the agent's have been through.
The same can be said about any "Data Mining" operation, it's split down into procedural components to make it easy to manage and no algorithm alone would be capable of creating "Matches" without enough data being presented. (Imagine you want an algorithm to look for things out of place, how can you do so if you haven't got enough information to identify what "Normal" looks like? This is why the number of data records collected and used by the various governments.)
The concern people have isn't the fact they are being watched by an automated observer following predefined algorithmic patterns, the concern itself is who makes those patterns and why. After all it might be right in the current climate to look for pattern match involving a particular archetype of person (i.e. a would be terrorist), however If the powers that be wanted to hunt people down at a Eugenics level (Singling out the poor, the disabled or those with socio-dysfunctional family ties), it would be easy to change it to operate like that. (Which is reminscent of what the Nazi movement was all about.)