Big Data vs. Data Privacy

Currently I’m working on a big data / data warehouse project. What we want to achieve is to collect data from different data sources (databases, tracking, …) in a central place so that we can extract certain data and run analyses on it.

The architecture basically looks like this:

The data is exported as a snapshot from the source system and saved to a snapshot file, so that historical data is available, e.g. for data scientists.
From there it is put into a database that reflects the current state of the data.
Finally there are different applications accessing the current state of the data, exporting it, doing aggregations, displaying charts, …

Of course, in such cases we are also talking about user-specific data, so data privacy matters a lot.

We spent a long time discussing the use case where a user's data has to be deleted from the system, and how to do it.
Our finding was that the hardest part is the snapshots. Because those snapshots are basically exports of the source database, a user’s data can be spread over many files that were collected over a year or longer.
For the current view of the data and for the analytics applications it is quite easy to delete and/or overwrite the data, because normally you store e.g. a user’s email only once in such a system. In the snapshots, however, you might have multiple instances of it in different files. In a deletion scenario that would mean going through a lot of files, checking each one for the specific user's entry and deleting or overwriting it.

So we came up with the idea of encrypting the snapshots: each user’s data in each snapshot is encrypted with that user's individual key. The key is stored in a central place. Each user has only one key, so there is e.g. a database table that maps user ID to key.
During the creation of the snapshot the data is encrypted and saved. Before importing it into the current state database it is decrypted, and the decrypted version is deleted right after the import.

If a user's data has to be deleted, it is deleted from the current state database and the statistics applications. Additionally, the encryption / decryption key of that specific user is deleted, so their data is still in the snapshots but it is no longer (easily) possible to access it.
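To make the idea a bit more concrete, here is a minimal sketch in Python. It is not our implementation (there is none yet), and all names (`KeyStore`, `encrypt_field`, `decrypt_field`) are made up for illustration. The cipher is only a toy keystream built from SHA-256 so that the example runs with the standard library; in a real system you would use an authenticated cipher such as AES-GCM from a vetted crypto library, and the key table would live in a database rather than in memory.

```python
import hashlib
import secrets


class KeyStore:
    """Hypothetical stand-in for the central table mapping user id -> key."""

    def __init__(self):
        self._keys = {}

    def key_for(self, user_id):
        # One key per user, created the first time the user is snapshotted.
        if user_id not in self._keys:
            self._keys[user_id] = secrets.token_bytes(32)
        return self._keys[user_id]

    def get(self, user_id):
        return self._keys.get(user_id)

    def forget(self, user_id):
        # Deleting the key is the "deletion" of the user's snapshot data.
        self._keys.pop(user_id, None)


def _keystream(key, nonce, length):
    # TOY cipher for illustration only (assumption): SHA-256 in counter mode.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_field(keys, user_id, plaintext):
    """Encrypt one user-specific value while the snapshot is written."""
    key = keys.key_for(user_id)
    nonce = secrets.token_bytes(16)  # fresh per value, stored with the data
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_field(keys, user_id, blob):
    """Decrypt a value before import; returns None if the key was deleted."""
    key = keys.get(user_id)
    if key is None:
        return None
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")
```

A snapshot written last year and one written yesterday both use the same per-user key, so a single `keys.forget(user_id)` makes the user's values unreadable in all of them without touching any snapshot file.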

At the moment this is more of an idea and has not been implemented yet, but we think it might be a good approach. I will keep you updated about a proof of concept.

Feedback and questions are appreciated as always! 😉


2-factor authentication for WordPress

As a software engineer you are normally paranoid about security. So I was really worried about somebody attacking my WordPress.
I did some research on how to improve security.
My first attempt was a better password plus Apache basic auth. But that can still be brute forced, and then I saw that there is a plugin to integrate 2-factor authentication via Google Authenticator.

To be able to use it you of course need the Google Authenticator app on your smartphone (Android:
So install it and set it up. You will be asked to visit a Google website, log in, scan a QR code and enter the first verification code, so that you can use 2-factor authentication for your Google account as well. This is not mandatory. Anyway, now you are done on your mobile.
If you want the app on multiple devices with the same secret key:

The next step is to log in to your WordPress and install and activate the Google Authenticator plugin (
Now switch to the users section and open the details of the user that should use 2-factor authentication.
Open the app on your mobile and add that account either via QR code or by entering the secret key manually. Relaxed mode should be enabled to compensate for time differences between the server and your phone.
Now save the user and, if you are brave, log out. 😉
The next time you try to log in you will be asked for your password and the verification code.
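If you are curious what happens behind the verification code: Google Authenticator implements the time-based one-time password scheme from RFC 6238 (HMAC-SHA1, 30 second steps, 6 digits). Below is a minimal Python sketch of it, not the plugin's actual code. The function names and the `window` parameter are my own; `window` only illustrates the idea behind relaxed mode (accepting codes from neighbouring time steps), and the tolerance the plugin really uses may differ.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32, now=None, step=30, digits=6):
    """Time-based code: HOTP with the number of elapsed time steps as counter."""
    secret = base64.b32decode(secret_b32.upper())
    now = time.time() if now is None else now
    return hotp(secret, int(now // step), digits)


def verify(secret_b32, code, now=None, window=1, step=30):
    """Accept codes from `window` steps before or after the current step.

    `window` is an illustrative parameter, not the plugin's setting.
    """
    secret = base64.b32decode(secret_b32.upper())
    now = time.time() if now is None else now
    counter = int(now // step)
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue
        expected = hotp(secret, counter + drift)
        if hmac.compare_digest(expected.encode(), code.encode()):
            return True
    return False
```

The secret key shown by the plugin is the base32 string that goes in as `secret_b32`; both the server and the app derive the same code from it and the current time, which is why the clocks have to roughly agree.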

Hope that helps to make your WordPress a little more secure.
It may also make sense to save the QR code or the secret key somewhere safe in case your phone breaks.



I attended a webinar about the security of web apps. Here is a short summary of some of the things they covered:

  • If there is a multi-stage (upload) process for user data (e.g. images), bind the data to the user session and delete unfinished or canceled uploads when the user session expires.
  • Use the HttpOnly and Secure attributes of cookies. HttpOnly protects a cookie from being accessed via JS and can be used if you need that cookie only on the server side, e.g. for session information. Secure makes sure the cookie is only sent over HTTPS.
  • Change the session ID when a user logs in to prevent session fixation, which is one way of hijacking a session.
  • To avoid clickjacking (displaying your website in a foreign iframe and abusing the clicks of your users) use the X-Frame-Options header and frame-busting JS.
  • Never store PHP / Java serialized data on the user's side (e.g. in a cookie) and unserialize it again later. Use JSON instead.
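A few of these points are easy to show in code. The following Python sketch is my own illustration, not something from the webinar: the in-memory `SESSIONS` store and all function names are hypothetical, and a real application would normally rely on its framework for this. Python's pickle plays the role of PHP / Java serialization here, which is why the client data is parsed with `json` only.

```python
import json
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # hypothetical in-memory store: session id -> session data


def new_session_id():
    return secrets.token_urlsafe(32)


def login(old_session_id, user_id):
    """Move the session to a fresh id on login, so an id that was known
    before the login cannot be used afterwards."""
    data = SESSIONS.pop(old_session_id, {})
    data["user_id"] = user_id
    new_id = new_session_id()
    SESSIONS[new_id] = data
    return new_id


def session_cookie_header(session_id):
    """Value for a Set-Cookie header with HttpOnly and Secure set."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True  # not readable from JS
    cookie["session"]["secure"] = True    # only sent over HTTPS
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


def security_headers():
    """Response header that forbids framing by any site."""
    return {"X-Frame-Options": "DENY"}


def parse_client_data(raw):
    """Parse data that came back from the client as JSON, never by
    unserializing objects. Anything unexpected becomes an empty dict."""
    try:
        data = json.loads(raw)
    except ValueError:
        return {}
    return data if isinstance(data, dict) else {}
```

Parsing JSON only ever produces plain dicts, lists, strings and numbers, so a manipulated cookie can give you wrong values but cannot make the server instantiate arbitrary objects.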

In the end all those things are quite obvious, but I think it is good to repeat even obvious things from time to time. 😉
