Written

WEEK 8 REVIEW

The thesis format will be:
Intro
Literature Review 1 (Domain 1)
Literature Review 2 (Domain 2)
Discussion (Intersect)
Conclusion


The domains need to focus on a particular area only, so far this is the thinking:
domain 1 is Human and Privacy.
domain 2 is Machine and Identity.


The latest attempt at defining the question is as follows:
In a public space filled with IoT devices, can privacy play a role in protecting a person from machine derived identification?

So, given all the above, we need to do some things:
Firstly refine the biblio to reflect the 2, and only 2, problem domains.

Start writing some notes about them that are suitable at least to be able to scan the meaning of a given article but could ultimately become the basis of the actual lit review.

Think and write about the currently mysterious Intersecting area; which is also where the project, artifact, prototype, archetype thing will exist - if it's still required or useful.

Human and Privacy:
The human part is an obvious one and I do not want to get into the definition of what a human is. The privacy one however is more complex and more suited to analysis in relation to this thesis. There are some historical definitions that are useful to note - 7 Principles of Data Protection, government data laws, etc. Social media needs to be corralled as soon as possible as it is too big and more of a parallel topic. It has useful research into the privacy concerns that it has prompted, and it is also a continuation of the Quantified Self idea where the human user provides the information (data) as a knowing participant. This also will apply to things like spying apps, etc and anything that has the user being a willing provider of information. This thesis has to be firmly rooted in the ideas of the IoT devices that collect the information without the user being aware. A person being aware that a public space has IoT in it, somewhere, will probably produce a Chilling Effect, so this would need to be examined. A possible precursor for this would be the use of CCTV.

Added to this idea of privacy is the need to provide research into it that considers the idea of privacy as a protective law etc, its relationship to identity and how privacy protects identity. So a Noun and a Verb.

Machine and Identity:
The use of machines, pervasive, ubiquitous computing and the algorithms needs to be defined as being non-human in execution beyond the initial development of the technology and techniques employed that are done so in isolation from the use and possibly as a technical exercise, such as "what can we make with A and B and C, what can these things do that is novel and new?".

The role of Big Data specifically needs to be refined, else it is going to end up being about the data. The data is the result of machines collecting things in public, so in this case it must be considered a Noun. We need to concern ourselves with finding a Verb, as we already have a suitable Noun in the human/privacy combination of domain 1. The collection of information in the context of Big Data has prompted many responses that have been researched so as far as that goes it could be considered to be aligned with the relationship that Social Media has to domain 1 above.

The machine is the IoT devices and beacons and it is that they are operating at a sensory level beyond human experience and that they are operating in ways that are far less obvious than a CCTV.

The identity part comes from the machines' attempt to garner the person's identity from some data: the act of walking through a public space may lead to a thorough identification of a person that results in the environment responding to that person in ways that they would not like. This part here needs greater thought as this is in some ways the motivation for the thesis. So what is it about the identification of a person that warrants investigation here? The EPIC request for info from the US gov is useful for outlining some issues. Mainly it would seem that rights and politics are the only real areas that would warrant attention here.

The role of privacy has to be considered as having various aspects that won't be covered in this thesis beyond mentioning that they exist. Political reasoning and meaning for privacy can be noted as having responsibility for maintaining and promoting a free and democratic society, that privacy can ensure free association and beliefs. This of course presupposes some entity that can respond badly to people, hence the need for privacy. This meandering paragraph should ultimately be saying that governments need to respect the privacy of their citizens.

The area of interest for this thesis should be contained by the right to retain privacy of identity as it may be used by any entities, such as government or commercial, for reasons that are undesired by that person. The storage of data for long term enables a retroactive determination of a given person's identity as well, so the idea of currency in human agency (is that a thing?) is questioned. If a person does something at one time does that mean they are to be forever known for and by that one thing?

The effect of having an IoT public space could be that people no longer consider that a free and public space. The idea is, at the back of the mind, that this is a monitored space with unknown entities collecting enough specific data to allow a determination of identity. Mainly this needs to be approached from the perspective of (audio) beacons that can instigate an interaction with a person's mobile phone (possibly considered an extension of the self) and communicate with it and invoke further communication via the internet. Instead of the more passive, watchful CCTV like state of surveillance normally considered a part of contemporary life, this new addition makes it more invasive, physically and materially (them words?). This last point needs to be and remain as the focus otherwise this will soon become unmanageable. This is a small part of a much larger story.



IDENTITY AND PRIVACY

In this thesis the basic unit of a person is their identity. This entity is a complex mix of actual physical being, history, relationships, situation (present) and possible future(s). Possible futures refer to the "possible" directions a person may take dependent on the now, their abilities, desires and, for want of a better word, opportunity. An ability to progress in a particular direction may not present itself as possible due to external constraints be they other people, position and even time.

Already there are external constraints on identity, just because you want to be an A doesn't mean you can... and from this we can understand that identity, who you are, may not be who you want to be or could be. An awareness of this reality is part of what constitutes identity: you are the person that would like to do A but cannot because of B. These aspects may be something communicated to others or they may be retained within the confines of the mind. The decision to reveal any aspect of identity should remain something that we are at least cognisant of if not actively in control of. Theories abound with attempts to determine and explain why this is of importance but for this discussion it is best to limit the why to being within the context of digitised bureaucracy.

The identity of who we think we are and who we think someone else thinks we are is fragile, changeable and open to interpretation. This identity is also inescapable. It can be manipulated internally, knowingly or otherwise, and it could be altered by convincing those external that this is indeed the new way to think of the old identity.

Privacy could be considered to be a metaphorical and physical barrier between the identity and the world beyond it. Privacy is an action or process (verb) as well as a name (noun) for what exists between the individual and the outside world. It is a choice to be enacted. It would be sensible to think of this privacy as being something that follows us wherever we go, it could be thought of as a specific part of our very being. If it is always present in the same temporal-space then we can be aware of its state and adjust it if need be. As much as we cannot control someone else's idea of our identity, even though we may attempt to influence it, we cannot control external renditions of our behaviour under the gaze of internet connected sensors.

There may be an uneasiness that sits at the back of the mind for anyone that walks down a street that is known to have CCTV cameras positioned along it. This is not due to the simplistic political adage that "if you have done nothing wrong, you have nothing to fear" - which in itself is a frightening thing to say to a population. Rather it is due to the uneasiness of being observed no matter what the situation: at the other extreme end of this "communication" there has, historically, been an unidentified person observing for something. This observation is now increasingly being done by computer vision and behavioural algorithms that attempt to classify an instance of "features" and compare it to a model of "target" archetypes.

Perhaps we have done nothing wrong and therefore have nothing to fear but that concept of nothing wrong is based upon our own understanding of our own behaviour. Perhaps that behaviour is also based upon an internal understanding that makes what could appear to be unacceptable, acceptable. That wallet lying on the ground could be picked up and placed in the pocket with the full understanding that it will be returned to the owner, untouched, at a later time. A machine observing behaviours may not allow for that possibility in its programming because how would it know that is something that would be done. Unless a large enough dataset has been built up that makes such a mechanical deduction reasonable.

As Michael Flanders once joked about aeroplanes - "But I'm never worried, because I know that every tiny part of this great machine is a miracle of modern engineering. Then the ashtray falls off..."



THE BIG NET IN THE SKY

Gathering data (points, sets) and then storing them for processing and attempting to understand them could probably be traced back to the first stock inventory, or even the first money (counting?) that attempted to have a unique entity that would represent something else of a value. This probably also corresponds to communication theories where they talk about signs, signifiers and signified. We could just be happy enough to understand that anything external to us has a representation in the mind that is an analogy or a simplification. An elephant is too big and complex to actually be contained in the mind so we use the word "elephant" and all the other mental associations that go with the concept and meaning of an elephant.

These associations can be wide and varied. They can include responses such as "run away". Whatever they are, they are generated by ourselves, whoever we are. Data in the internet age takes this meaning and refines it to a digital representation of something, this could be most simply described as the boolean logic of 1 or 0, on or off, true or false, yes or no. This binary limitation is a reflection on the properties that constitute a computer at the machine/silicon level. But from this basis of the smallest atomic computing unit of a 1 or a 0 we can then construct numbers. These numbers are meaningful only in the context of relationships to other numbers so that these relations become part of computer logic such as AND, OR, NOT. The processing possible seems limited at this scale but when increased via high level programming languages complexity creates possibility.
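
As a rough illustration of how numbers and simple processing can be built from nothing more than 1s and 0s, the sketch below combines bits into a number and adds two single bits using only AND, OR and NOT. The function names are made up for this note; it is a toy, not a description of any particular machine.

```python
def bits_to_int(bits):
    """Combine a sequence of 1/0 values (most significant bit first) into a number."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def half_adder(a, b):
    """Add two single bits using only AND, OR and NOT; returns (sum_bit, carry_bit)."""
    carry = a & b                      # AND: both bits set means a carry
    not_both = 1 - (a & b)             # NOT of (a AND b), for a single bit
    sum_bit = (a | b) & not_both       # (a OR b) AND NOT (a AND b)
    return sum_bit, carry


print(bits_to_int([1, 0, 1]))  # the bits 1, 0, 1 read as a binary number
print(half_adder(1, 1))        # one plus one: sum bit and carry bit
```

Chaining such adders together is one way the "limited" logic at this scale grows into general arithmetic.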

This digitisation inherits from a "ledger-book" style of bureaucracy where a something is reduced to a specific symbolic entity that is written into a ledger and can be referred to at a later date when information or meaning about this something needs to be recalled. A useful technique if the somethings number more than can be reliably memorised and recalled by a single human and also useful when that information about a something is required by other interested parties.

Digitisation, storage, processing and transmission via contemporary computer based machines have increased the capabilities of all of these areas. Added to this is an ability to infer aspects that are otherwise not deliberately sought or incapable of being sought for one reason or another. Inference and its close relative association can be achieved by collecting enough information in the surrounding area that a hidden aspect could be revealed. Analogies are not relevant in this instance as the very concept needs to retain its complexity.

We could discuss computer vision and its use of background subtraction to turn the new into a trackable "blob" but limiting ourselves to a vector array of pixel colour values would do a disservice to the power now present in networked computers using state-of-the-art algorithms to learn how to deduce information from massive unlabelled datasets that may be of interest, somehow, somewhen. The datasets are not singular but rather can be aggregated or collected from various sources that have as their raison d'être seemingly innocuous data gathering procedures, for instance a household thermostat that records changes to external and internal temperatures could indicate that a person is in the house.
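
For reference, the simplest form of the background subtraction mentioned above can be sketched in a few lines. This assumes greyscale frames held as plain lists of pixel values and an arbitrary threshold of 30; real computer vision systems use far more robust background models, so treat this only as an illustration of the idea of turning "the new" into a trackable region.

```python
def changed_pixels(background, frame, threshold=30):
    """Return (row, col) positions where the frame differs from the background."""
    changed = []
    for r, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for c, (bg, fr) in enumerate(zip(bg_row, fr_row)):
            if abs(fr - bg) > threshold:
                changed.append((r, c))
    return changed


def bounding_box(pixels):
    """Smallest box (top, left, bottom, right) containing all changed pixels."""
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 3x4 frames: an empty scene, then the same scene with something new in it.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [10, 10, 10, 10],
    [10, 200, 210, 10],
    [10, 190, 12, 10],
]

blob = changed_pixels(background, frame)
print(bounding_box(blob))  # the "blob" reduced to a box that can be followed frame to frame
```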



INTERFACES

The mobile phone as computer has changed the way that person to person communication can function. The telephone is the central component of these devices and enables direct person to person exchange. Prior to the mobile phone a telephone was physically bound to a specific address, be it a home or a business. Within a business a PABX is used to sub divide a building into departments and people so from this perspective it is possible to speak directly to a person of interest.

For the general public, the telephone is the household and prior to recording machines (answering machines) being added a phone call to a household could "ring out" with no answer - deliberately or otherwise. With the advent of the mobile phone they are always on which requires a different behaviour and understanding for their use. Combined with the miniaturisation of computers that have now become an integral part of the phone, we can now use it for non-verbal communications. Whereas the copper line was bound to the electrical analog of the human voice, now the HSDPA/HSUPA of today provides access for the computerised phone to digitise the voice (typically using an adaptive multi-rate audio codec, RFC 4867) and transmit a human conversation in a machine-to-machine language. That phone can also transmit and receive data in any transmittable format: visual, audio, written...

This idea of an always-on device that is bound to the physical space of the user is changing the way other people can respond to that user. Mobile phones (usually referred to as "smart phones") are not only senders of information to the internet, they are also receivers of information. This usually takes place in the form of either an application or the operating system itself taking advantage of the always connected facet of the phone to send and receive information from somewhere on the internet. This typically takes place in the background, behind the immediate user interface level in such a way that the phone can appear to "wake-up" to inform the user of something: a message has been received from a friend, an update to a software, a notification about some event, etc.

One of the side products of a society based heavily on electrical energy usage is its inefficient methods of generation, transmission and storage. Electrical energy for a phone is stored in a small chemical battery that deteriorates over a period of 1-2 years of typical use. This battery has two main draws on its power which are the screen and the radios that transmit and receive signals. The onboard processor adds a considerable drain on energy when it is used. This practical reality (coupled with a small screen size, constant connection) has encouraged processing and storage of data to be carried out off-device. Matching this idea is entrepreneurialist capitalism's tendency to "disrupt" by providing a service for a process that was previously carried out by the user (IaaS, SaaS, PaaS, ITaaS and The Cloud).

Alongside the transition of the phone to a smart phone is the relatively recent arrival of the beacon and its derivative the audio beacon. The beacon functions by emitting a short, low power bluetooth radio transmission that contains a small payload of data. This data can be understood by the mobile phone and utilised as a way of communicating to somewhere on the internet that the user is within a specific proximity to the beacon. This is a continuation of the method of GPS based location awareness in that the user is able to switch this ability on or off. With the addition of near ultra-high frequency audio transmissions, beyond the range of typical human hearing, the mobile phone is capable of receiving and acting upon this transmission without user awareness or response. This ability is coupled with a software methodology in which frameworks or libraries (Software Development Kits) of code are available to be "plugged in" to applications, which means that a software library developed for this explicit purpose can be included in any other software application. This serves as a further layer of obfuscation for the user to come across.
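
To make the "small payload of data" concrete, here is a minimal sketch of how a phone-side library might unpack one. The field layout (a 16-byte identifier, two 2-byte numbers and one signed byte of calibrated signal power) is an assumption chosen for illustration and is not taken from any specification referenced in these notes; `parse_payload` and `PAYLOAD_FORMAT` are hypothetical names.

```python
import struct
import uuid

# Assumed layout, big-endian: 16-byte identifier, 2-byte major, 2-byte minor,
# 1 signed byte of calibrated transmit power. Illustrative only.
PAYLOAD_FORMAT = ">16sHHb"


def parse_payload(payload):
    """Unpack a 21-byte beacon payload into named fields."""
    ident, major, minor, tx_power = struct.unpack(PAYLOAD_FORMAT, payload)
    return {
        "uuid": str(uuid.UUID(bytes=ident)),
        "major": major,
        "minor": minor,
        "tx_power": tx_power,
    }


# A made-up payload, as a beacon in a shopping centre might broadcast it.
example = struct.pack(PAYLOAD_FORMAT, uuid.UUID(int=1).bytes, 7, 42, -59)
print(parse_payload(example))
```

The point for the thesis is how little is needed: a handful of bytes is enough for an application to report to a server which beacon the phone is standing next to.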

These beacons are primarily developed under the banner of the "internet of things" where disruptive technologies and implementations are funded in the hope that they provide suitable investment capital returns and are therefore considered primarily to be a commercial product. Developments and techniques used by governmental bodies (NSA, GCHQ, et al) will be considered to be beyond the scope of this thesis.



SCENARIO

An attempt to describe things - not an attempt to be precise or conclusive.

The archetypal use-case is of a person carrying a mobile phone into/through a public space that has a beacon (specifically an audio beacon). This mobile phone is used by the person (referred to as the end-user) as an extension of their being. They use it to communicate to people, participate in society and to find out information relevant to them as a means to determine what it is that they want to do. This last part can be considered to be part of conscious choice, human will or human agency in that information from and about the present is compared or contrasted to an internal "world model". This comparison then can be the instigator of a choice which for the end-user may simply be a matter of where in this shopping centre do I go to get some lunch.

Trying to come to some sensible assignment of Privacy in this scenario is what can be considered to be causing concern amongst privacy advocates and other interested parties. Is this internal thought process the specific definition of privacy that can be determined as all that remains of the person-world divide?

The mobile phone, social media and any activity that takes place with technology connected to the internet can be considered to be part of what and how a person constructs and enacts their identity online. This concept of online versus offline is one that is increasingly becoming meaningless as a divide especially when multiple objects in the home, multiple devices in public spaces are all connected to the internet and are sending information to somewhere. This information can be specifically about the device itself, for instance the drink vending machine is connected to its "home" server so that it may send data about its levels of stock.

But it can also be devices that send data about the spaces that people inhabit. A sensor in the home that sends information about whether lights are turned on or not can also be inadvertently sending information that leads to an assumption that someone is in the home. If the light is on in the kitchen and it is the evening, then maybe we can assume that someone is cooking dinner. If the fridge is telling its home server that it has scanned the levels of cheese remaining since being used a few minutes ago and that the stove top is telling its home server that a saucepan of water is being boiled then maybe we can make assumptions as to what is being cooked and the amount that is being cooked. This may lead to assumptions as to how many people are in the house. By querying the home router with its long term leased IP address, how many clients have been assigned a DHCP lease, we can come to the conclusion that there is an extra device connected not normally found in this house. This new device is identified as an Android 4.4 device, it is currently using the facebook app and by querying its user profile we have found a name for this new person.
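
The inference step in this scenario can be sketched very simply. Everything below is hypothetical: the device addresses, the host names and the rule that one unfamiliar device means one extra person are assumptions made up to show how seemingly innocuous records turn into a guess about who is in the house.

```python
# Devices the observer already associates with this household (hypothetical).
KNOWN_DEVICES = {"aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"}

# What the home router currently reports as active leases (hypothetical).
current_leases = {
    "aa:aa:aa:aa:aa:01": "resident-phone",
    "aa:aa:aa:aa:aa:02": "fridge",
    "bb:bb:bb:bb:bb:09": "android-visitor",
}


def unfamiliar_clients(leases, known):
    """Return the leases whose hardware address has not been seen here before."""
    return {mac: name for mac, name in leases.items() if mac not in known}


def occupancy_guess(leases, known, residents=1):
    """Crude assumption: each unfamiliar device stands for one extra person."""
    return residents + len(unfamiliar_clients(leases, known))


print(unfamiliar_clients(current_leases, KNOWN_DEVICES))
print(occupancy_guess(current_leases, KNOWN_DEVICES))
```

None of these records was collected in order to count people, which is exactly why the aggregation matters.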

Who are these people? Why are they in the house at the same time? Do we care? Let's make some assumptions based upon what information we already have about these people. The person that lives in the house is well known to us, they have regular and highly accessible data sets that describe their behaviours at nearly all times of the day, every day. This new person, however, has never appeared in this particular data set, nor do we have much information about them other than a small array derived from a hardly used facebook account. This account is not very useful in determining anything about this person. Let's look wider by cross-referencing the name in any database that we have access to. The name they are using returns a null, it would appear that the name they use on their facebook account is a nom de plume. Let's ask the phone to delve deeper. One of the apps (a game) installed on the phone is connected to a secondary advertising network that we have access to. This app has permission to query all the installed apps on the phone, it has permission to query any user accounts set up on the phone. We now have the Google account name and the name of several other communication/chat apps. We also have the IMEI number that uniquely identifies this mobile phone on the network. We also now have the logs of all the incoming and outgoing SMS messages and we now have the phone call logs. We have read/write access to the storage and the photos folder. We have what appears to be a "selfie".

Let's now draw a map of all these points of data and see if we can find a more accurate representation of who this is. The SMS logs to "Mum" appear frequently so we can start there. "Mum" has a phone number and a location. And accounts. And a history. And children. And their names. Hello user. You like macaroni and cheese. Here's an advert we have. Specifically for you. Sent to your phone. Now.



USE CASE

diagram link

The three main components are the Human, Interface and Machine which will be broken down as follows:

The Human has an Identity (the construction of this is irrelevant at this point, the only important aspect is the assumption from the Human is that it is their Identity), and as an interface to this is Privacy. Privacy can be a behaviour as well as external frameworks that help construct and/or strengthen. Privacy could also be thought of as being the body that houses the consciousness, or the body that houses the mind that constructs the identity. A non-corporate entity.

The Interface consists of the devices (technology) that are carried or installed within public spaces. The Human has a mobile phone (full of sensory inputs) and the IoT (better word?) has Beacons that are made up of sensors and connections. Here the divide between the entities blurs.

The Machine is that part where networked computers at large scale are programmed with complex algorithms designed to categorise and store data sets based upon any sensory input received. These algorithms are also responsible for identification of traits, or features of interest, that can then be analysed for pattern recognition utilised by behavioural prediction. Classifications, or output, can then be constructed and used to create/manipulate a chosen Identity. A corporate entity.



CLASS DIAGRAM

diagram link

This is an introduction to the construction of a Class Diagram based upon packages derived from collections of research papers. So far three (plus a tentative fourth) packages have been sorted as being of an overall importance and may be useful as major categories of definitions. These are Data, Privacy and Behaviour.

Ultimately Data can be thought of as the representation of Identity in a transmediated, digital world. Its use is as a Noun due to its representation of a numerical/statistical element relating to the User, while at the same time it is a Verb for transmission and a representation of a possible action due to the type of data it represents.

Behaviour will represent the considered human response as well as the sub-conscious or accidental. This covers Law and policy frameworks as an explicit form of Behaviour as well as Social Media being the catch-all phrase to encompass all User actions (Verb). Involved in this at a complete level is Theory, that is enacted (Verb) not just by the Law but by the User while they use the technology. Theory as a Noun will be examined to discover current thinking, direction and possible concerns specifically in regards to Privacy and Identity.

Privacy relates to the idea (Noun) of it within the allied fields of IoT, Data and Technology. The latter is due to the need to examine the history of classification, concerns and implementation in computing and User interaction in general as well as this particular technology being the basis of the current state of concerns. Data Privacy relates to the collection, storage, processing and distribution (and selling) of data sets and considers not just the explicit data points collected but also allows for aggregation and predictive association to fall under the concerns of Privacy.

IoT will specifically deal with the collection of data at the visible/obvious stage as well as at the invisible/obscured stage where Beacons (Noun and Verb) are placed (or delivered - consider a Beacon to be a payload as well) in public areas in a way that does not inform the public of their existence and purpose. Also of importance here is to include an analogy to a Hobson's Choice, where a Beacon is present in a public/social space (consider a private/commercial space as well) and passive acceptance or active denial is the only "choice".

Identity as a package is noted and is yet to be added in a meaningful way. It will consist of Noun Verb attributes, as its definition and enaction (as a process) seem to align with thinking surrounding ideas of Identity and self, regardless of the research perspective.



PRECURSOR 1

future link

My honours research relates to the intersection of human and machine, specifically the idea of representation of identity in a transmediated world and privacy in an environment of pervasive computing that utilises wireless sensor networks and neural-net, machine-based learning algorithms that are intended to classify, predict and encourage types of human behaviour. Currently my research is limited to mobile phones, beacons and Internet-of-Things (IoT) sensors as generators of data points to be processed by networked machines. Surrounding this physical definition of parts and machine float the more human elements of privacy, behaviour and identity. While this may seem large in scope, my thesis itself will be focused upon a single, archetypal use-case consisting of a human carried mobile phone passing through a public space that has embedded sensors emitting near ultra-high frequency audio signals across its path. These signals invoke computational processes and network connections for the purpose of allowing communication between machine and machine, without user knowledge or interaction. This communication will involve descriptions (a vector of data points) of the user as they pass through the space, which networked machines will use to attempt to determine/predict future user behaviour.

The precursor is not a prototype. The precursor is designed to raise questions and interrogate the concepts, assumptions and preconceived ideas. Conceptually the thesis will be, by necessity, grounded in engineering and computer science as that is the only logical perspective that I can utilise to ensure that understanding and claims are based upon a reasonable comprehension of this particular aspect of reality. The meaning of this, however, is not to be limited by that grounding, it needs to be contextualised and expanded within the field of human behaviour and identity.

To help determine what role privacy, identity and human agency may have in relation to my thesis, I devised a performance for my precursor that demonstrates human interaction with machine. The machine, being a networked computer, was programmed by the human, me, to create a poem. As it is not a computer that employs state-of-the-art algorithms to construct poetry-like verse, I manually programmed it to construct a verse by randomly selecting words from a vocabulary of 97,648 words. This procedural randomness was contained within an algorithm that was logically derived from an internet search for instructions on sentence construction.
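
For anyone wanting to picture the mechanics, the kind of procedure described above can be sketched as follows. This is not the precursor's actual code: the tiny vocabulary stands in for the 97,648-word list, and the sentence template (article, adjective, noun, verb, article, noun) is an assumed stand-in for the sentence construction instructions found by internet search.

```python
import random

# Stand-in vocabulary, grouped by part of speech (hypothetical; the real list is far larger).
VOCABULARY = {
    "article": ["the", "a"],
    "adjective": ["silent", "networked", "private", "restless"],
    "noun": ["beacon", "street", "machine", "stranger"],
    "verb": ["watches", "follows", "forgets", "answers"],
}

# Assumed sentence pattern: one slot per word in the line.
TEMPLATE = ["article", "adjective", "noun", "verb", "article", "noun"]


def make_line(rng):
    """Fill each slot of the template with a randomly selected word."""
    return " ".join(rng.choice(VOCABULARY[part]) for part in TEMPLATE)


def make_poem(lines=4, seed=None):
    """Return a list of randomly constructed lines; a seed makes the result repeatable."""
    rng = random.Random(seed)
    return [make_line(rng) for _ in range(lines)]


for line in make_poem(lines=4):
    print(line)
```

The randomness sits entirely inside a human-written rule, which is the relationship between human and machine that the performance was meant to question.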

Upon visiting the webpage that is the precursor, the page is rendered into human readable words that describe this process as well as the resultant poem. If the user presses a button labelled 'render' then the webserver machine makes the user's web browser machine read the description and the poem out via a synthesized human voice. While reading this out, it plays the near-ultra high frequency rendition of the poem as well.

For the performance, this 'automatic' rendition was not enabled and instead I manually typed words into the voice synthesizer that formed a series (narrative) of questions and statements that were intended to reflect my own uncertainty about the relevance of certain conceptual problems to my project. The comfort of being in control of the process, even as a performance, is bound up with problems of identity and human agency, prompting the question: would the removal of this control be noticeable in any state? So for me, as the archetype in this instance, I would need to be in control, to ensure its subservience to my (human) will. To correlate my response to an array of other human responses, I need to bypass or ignore reflective reasoning and assume the mantle of dispassionate observer. The rationale for this, as far as I am concerned, is that I cannot be the archetype.