Utilising data – Open & Big Data in construction

The UK is now becoming a much more digitised community, and data is being collected in a range of different ways. However, it is now important that organisations truly understand how to successfully analyse this data in order for improvements to be made. This article discusses how data can be used to inform decisions, and how opening datasets has the potential to drive economic, social and environmental benefit.

This is the second in a series of three articles based on a study carried out by BRE in collaboration with Constructing Excellence group Generation for Change to investigate the potential for data in the built environment. The study looked to demonstrate the potential benefits that collecting, managing, analysing and even releasing data can have on a range of organisations within the construction sector.

Mining

Data mining is an analytic process used to identify patterns, relationships and correlations within a dataset (Dell, Data Mining Techniques). It can also include looking for anomalies. For example, data mining can be used to spot anomalies in any type of consumption data (water, electricity, gas). Identifying these anomalies allows an organisation to investigate what has caused atypical readings at a particular location or point in time. Mining can also be used to detect patterns within datasets. For example, if a site manager records all of the near misses witnessed on a construction site, mining could be used to establish whether there are patterns in the data. This would enable the site manager to identify the areas which currently pose the highest risk, and to put in place measures that could reduce it. It is this sort of analysis which can help organisations to become increasingly efficient in everything they do.
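As a rough illustration of spotting anomalies in consumption data, the sketch below flags readings that sit far from the typical value. It is a minimal example under stated assumptions rather than a recommended method: the function name `find_anomalies`, the threshold of 3.5 and the sample electricity readings are all hypothetical, and the modified z-score (based on the median absolute deviation) is just one of many techniques that could be used.

```python
from statistics import median


def find_anomalies(readings, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme reading does not distort the baseline.
    """
    med = median(readings)
    mad = median(abs(r - med) for r in readings)
    if mad == 0:
        # At least half the readings equal the median; nothing to scale by.
        return []
    return [
        (i, r)
        for i, r in enumerate(readings)
        if 0.6745 * abs(r - med) / mad > threshold
    ]


# Hypothetical daily electricity readings (kWh) for one site.
daily_kwh = [410, 395, 402, 398, 1250, 405, 399, 407]
print(find_anomalies(daily_kwh))  # flags the reading at index 4
```

A flagged reading is only a prompt for investigation: it could be a leak, a metering fault or simply an unusually busy day on site.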

Mapping

One of the most effective ways to utilise data is to map it. Provided you have geospatial data (data that contains a geographical location), it can be placed upon a map. Figure 1 demonstrates how data regarding fuel poverty in London has been mapped in order to clearly distinguish the most affected areas. This can help government departments to develop new policies to overcome these issues. Mapping is a particularly effective way of displaying data due to our familiarity with digital maps, such as Google Maps. People can relate things to a map more quickly than if the same data were portrayed in a spreadsheet. It is felt that mapping is particularly useful for portraying information within the construction industry, because those within the sector interact with maps and plans on a daily basis.

https://youtu.be/ym9qkOwNuBc

Figure 1. Distribution of fuel poverty in London. Map based on BRE housing stock models and Experian data, taken from The cost of poor housing in London (IHS BRE Press, 2014)

Mashing

The real power within mapping can be seen when you start to combine data sets (which is often referred to as mashing or crunching) in an attempt to find correlations. For example, if you were to map data which demonstrates the percentage of waste sent to landfill on a range of construction projects throughout Scotland, it may appear that there is one particular location where landfilling predominates, rather than recycling. With this one dataset it would be difficult to establish why this might be. However, if you were to combine this data with the locations of all of the waste recycling facilities within the surrounding areas it may become apparent that this location does not have a facility within close proximity, and subsequently it is easier and cheaper for contractors in this area to just landfill their construction waste. This method enables you to combine datasets in an attempt to understand whether there are correlations between different variables. Subsequently, this allows measures to be put in place to overcome problems or spot opportunities for improvement.
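The landfill example above can be sketched in a few lines of code. The snippet below combines a dataset of projects with a dataset of recycling facilities by calculating, for each project, the straight-line distance to the nearest facility. All names, coordinates and percentages are invented for illustration, `mash` and `haversine_km` are hypothetical helper names, and straight-line distance is only a crude stand-in for real haulage distance.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius


# Dataset 1 (hypothetical): share of waste sent to landfill per project.
projects = [
    {"name": "Project A", "lat": 55.95, "lon": -3.19, "landfill_pct": 12},
    {"name": "Project B", "lat": 55.86, "lon": -4.25, "landfill_pct": 15},
    {"name": "Project C", "lat": 58.21, "lon": -6.39, "landfill_pct": 64},
]

# Dataset 2 (hypothetical): locations of waste recycling facilities.
facilities = [
    {"name": "Facility 1", "lat": 55.93, "lon": -3.30},
    {"name": "Facility 2", "lat": 55.85, "lon": -4.30},
]


def mash(projects, facilities):
    """Attach the distance to the nearest facility to every project."""
    combined = []
    for p in projects:
        nearest = min(
            haversine_km(p["lat"], p["lon"], f["lat"], f["lon"])
            for f in facilities
        )
        combined.append({**p, "nearest_facility_km": round(nearest, 1)})
    return combined


for row in mash(projects, facilities):
    print(row["name"], row["landfill_pct"], row["nearest_facility_km"])
```

In this made-up data the project with the highest landfill rate is also the one furthest from any facility, which is the kind of relationship that neither dataset shows on its own. A correlation like this suggests a cause worth investigating but does not prove one.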

https://youtu.be/M0RkK2AB5ks

Utilising open data sources: Reducing the risk!

Case Study: Democrata

Democrata understood the risk many construction projects face when it comes to unexpected archaeological finds. The discovery of historically important remains can lead to a loss of both time and money, and with many projects currently running on small margins it is important to understand the risk involved. With help from the Hartree Centre and its partner IBM, Democrata utilised open datasets to try and reduce this risk. By using data regarding previous discoveries, settlement patterns and landscape characteristics held by organisations such as English Heritage, Ordnance Survey and the Land Registry, Democrata were able to provide an estimated risk for 200 million points throughout the UK.

The benefit of this to the construction industry is that it enables an organisation to consider this issue of archaeology when selecting a potential site. Furthermore, when a site has been decided upon, the scheduling and budgeting can take into account the attributed risk that this particular location faces. This can help to reduce the impact any archaeological find will have upon development.

Driving the release of open data within construction

A large amount of the open data already released has come from the public sector, for example through data.gov.uk. There are clear benefits for the public sector in doing so. First of all, it provides transparency. By releasing data and information, a local authority demonstrates a desire to be open and honest with its residents. Secondly, releasing data provides developers with information that enables them to develop new and innovative applications which not only create economic growth, but can also provide wider benefits to society. Finally, releasing data can reduce the number of freedom of information (FOI) requests from the public. Manchester City Council has already seen an improvement in its efficiency, which has led to substantial savings. It is these clear and distinct benefits which have made the public sector the forerunner in the open data agenda.

Although a large proportion of the case studies surrounding open data have arisen from public sector data, there are also drivers pushing the release of private sector data. The desire to open up datasets in order to appear transparent to the population is not just of importance to the public sector. It is felt that corporate social responsibility (CSR) is currently driving many large organisations (most notably clothing manufacturers) to publish data on their sustainability as well as their supply chain. Another driver of data-related collaboration is that sharing datasets between companies can help to reduce a common risk. CIO UK reports that insurers have been undertaking initiatives like this for many years in an attempt to combat fraudulent claims. This could have similar benefits for organisations in the construction industry. Once high-risk areas of a construction project have been identified (for example planning), it could be beneficial for companies to share data in these areas to mutually reduce risk and improve efficiency.

When discussing the topic of open data throughout this project, it became clear that people find it very easy to understand why a local authority would start to release its datasets. However, as soon as the topic changes to the release of private sector information, the argument loses momentum. People remain wary of giving away information that they believe to be of value to potential competitors. There is a perception that data is like a rare object and that once you have given it away you have lost all of the value it held. However, when interviewed, Ben Cave, from Citadel on the Move, explained that this is not always the case. Cave suggests that we need to reframe the benefits of open data. Currently it is seen as simply for social good, and although releasing data can be of benefit to society, it can also have an economic benefit for those who release it. Getting people to discuss how innovation can arise from your datasets is a huge commercial benefit, and could even be looked upon as free research and development. Furthermore, by allowing completely different sectors (which are not hidebound by your industry's current processes) to analyse your data, you may find uses which people internally would never have thought of.

https://youtu.be/KXfFs_hm5E0

Business models for releasing open data

The ODI has undertaken great work to better explain the potential business models which can help organisations get value from releasing their data. The work concentrates on several business models including freemium and cross subsidy.

The ODI explains that the freemium model involves an organisation releasing a sample dataset or tool for free (perhaps a small portion of their dataset), which in turn will create interest in a more in-depth product which has the potential to be sold (for example the entire dataset). By allowing people access to your datasets it enables them to understand the potential the entire dataset could offer. This business model of using a sample to entice people to purchase more is a commonly undertaken approach, and within construction it is often seen within market research. For example, organisations often provide small summaries as well as several key insights on a particular topic in an effort to tempt people to purchase the full-price report.

There are alternative ways to utilise the freemium business model. Organisations can also release their data under a licence which requires those who use it to openly share their findings. This can be extremely beneficial, as it can result in the development of new uses for your data that your organisation had never considered. Alternatively, if a developer does not want to release their findings after using your organisation's data, they must negotiate with you to obtain the data under an alternative licence. The new licence agreement can be charged for, creating value.

Another proposed model is cross subsidy. This is where alternative benefits arise within your organisation from the release of open data. One simple benefit is that releasing data can be a great source of advertising. An organisation can attach a licence requiring anyone who uses its data to credit the organisation for collecting it. A further benefit can be additional business opportunities. For example, if an organisation which collects data on the efficiency of construction projects decided to release a number of its datasets, it is extremely likely that it would have the most knowledge and experience of using those datasets. With that in mind, this organisation could offer a range of consultancy services to those looking to utilise the datasets. Furthermore, as it is increasingly important to have the most recent data, there is the opportunity to charge for updates to the originally released dataset.

Opening up data and tools: Taking the leap!

Case Study: Space Syntax

Space Syntax is an urban planning and building design company. One of the original pioneers of the “science of cities”, Space Syntax provides software that predicts the impact of planning and transport decisions (such as street connectivity and land use location) on human behaviour patterns and property values.

After discussing the possibility of opening up their core software to the world, Space Syntax and University College London took the leap in 2010 and made it open source. Open source means that Space Syntax allows anyone to look at their programming code, and make alterations to it. To many this would be considered counter-intuitive, risking the commercial position that Space Syntax had carefully nurtured over the previous 20 years. However, five years of increasingly positive results have shown that they have managed to do this without having a detrimental impact upon their company – indeed the opposite.

Although several people have started to adapt the open-sourced software without the help of Space Syntax, many individuals or organisations do not actually have the resources or inclination to undertake a project like this alone. Subsequently, many customers still return to Space Syntax as they consider them best placed to deliver their needs.

This move to open up their software has been undertaken to try and improve the quality of the industry that they are in. Space Syntax do not want to be the best within a poor quality market; they would rather be in a more thriving area, competing for top spot within it. It may not be perceived as the traditional route; however, Space Syntax feel this is actually the most effective way of creating commercial interest. So much so that they are in the process of opening up even more of their previously proprietary knowledge. A key element in this strategy is the Space Syntax Online Training Platform, which provides people with a range of case studies and online manuals, enabling them to truly understand how to utilise the software.

The next step for Space Syntax is to publish the many datasets they hold on spatial form and urban performance.

Watch this space…

https://youtu.be/IF7Mjl4BJn4

Although we have identified a range of benefits, this does not necessarily mean that organisations should start releasing all of their datasets. However, it does demonstrate that non-traditional business models exist. Organisations need to initiate conversations surrounding open data, and start to understand the potential benefits that it could bring them, whether they would like to utilise open data that has already been released by other organisations, or are thinking of releasing their own. Figure 2 illustrates a few basic considerations which organisations should weigh when thinking of becoming involved with open data.

Figure 2. Basic considerations when utilising or releasing open data

Barriers within the construction industry

One of the major barriers to the mass uptake of data utilisation within the construction industry remains the lack of knowledge within the sector. During a discussion, Professor Tim Stonor, Managing Director of Space Syntax, explained that the construction industry is currently trailing other sectors, not only in its ability to collect and manage data, but also in its ability to analyse it. Due to this, there is a need to be realistic about what can be achieved at this moment in time. There is still a perception within many companies that data collection and management is a burden, and in order to move forward we need to alter this perception to one of opportunity and potential.

Stuart Chalmers, Director for Digital Products within BRE, followed on from this point to illustrate that one of the most prominent barriers to the collection, management, and release of data is the procedural change required to facilitate this. In order for many organisations to participate in this area they may be forced to alter operating procedures that they could have undertaken for decades.

https://youtu.be/IFAYJHbYCsM

Another barrier which is believed to be reducing data-related collaboration within the construction industry supply chain is the substantial variation between the organisations within it. There is huge potential for improved collaboration if companies that work with one another could share data. However, it was argued that larger companies often find it easier than SMEs to source the necessary resources, and in some cases knowledge, to collect, manage and release data. Conversely, those smaller companies which do work successfully with data can be frustrated by the often slow pace at which large organisations move, as well as their lack of agility. Barriers like this suggest that innovative ideas will be required in order to overcome these challenges.

Improving participation within the industry

people within this community are becoming much more aware of this. As people increase their knowledge of data, and the possibilities surrounding it, it is likely that processes in the future will be designed in a way that considers a data driven approach. Subsequently, it is believed that there will be a significant need for tools and software capable of utilising this data at every scale within the construction industry.

The European Commission have understood the need for this, and subsequently funded a project, Citadel on the Move, which has made huge steps in demonstrating that applications can be easily developed through the use of open data. Citadel on the Move has determined that a lack of standardisation during data collection makes it much more difficult to undertake efficient data analysis. Subsequently, Citadel has been working with organisations to enable them to produce data in a reusable and interoperable format. Citadel identified that most people were unaware of how to turn open data into a usable application. With this in mind, they developed a template enabling anyone to create a mobile-based application which utilises their data. There are several examples online of applications that have been created using this template, including one that gives people in Ghent real-time data on the availability of parking within the city.

Creating applications with the help of templates

Case Study: City of Ghent – Realtime parking availability in Ghent

One of the highest rated applications built using the templates provided by Citadel on the Move is ‘Realtime parking availability in Ghent’ (www.citadelonthemove.eu/ExploreMyCitadel/AppCatalogue/tabid/233/agentType/View/appID/214/Realtime-parking-availability-in-Ghent.aspx). This application takes real-time data from car parks in Ghent and shows car space availability throughout the city. It places availability data onto an easy to use map, allowing users to access up-to-date information on whether a particular car park is full, and saving people the time and money spent searching the city for a parking space (Figure 3).

The Citadel on the Move project was actually developed with the help of the City of Ghent, and this parking application was a great opportunity to illustrate how successful the template could be. The City of Ghent is now focusing on opening up more datasets, while simultaneously improving their quality and standardisation. It is believed that providing this information will drive the development of more useful applications.

It is examples like this which demonstrate the potential which tools such as Citadel on the Move can offer.
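To give a feel for what putting availability data on a map involves, the sketch below converts a handful of car park records into GeoJSON, a widely used open format that most web mapping tools can display. It is an illustrative assumption only: it is not the Citadel on the Move template or the City of Ghent's actual data feed, and the field names, car park names, coordinates and counts are all hypothetical.

```python
import json


def to_geojson(car_parks):
    """Convert car park records into a GeoJSON FeatureCollection."""
    features = []
    for cp in car_parks:
        free = cp["capacity"] - cp["occupied"]
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude, then latitude.
            "geometry": {"type": "Point",
                         "coordinates": [cp["lon"], cp["lat"]]},
            "properties": {"name": cp["name"],
                           "free_spaces": free,
                           "full": free <= 0},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical snapshot of two car parks (not real Ghent data).
snapshot = [
    {"name": "Car park 1", "lat": 51.05, "lon": 3.72,
     "capacity": 400, "occupied": 400},
    {"name": "Car park 2", "lat": 51.04, "lon": 3.73,
     "capacity": 250, "occupied": 180},
]

print(json.dumps(to_geojson(snapshot), indent=2))
```

Publishing data in a common, documented structure like this is one way of achieving the reusable and interoperable formats discussed above, because any developer can load it without first reverse-engineering a bespoke layout.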

Figure 3. Realtime parking availability in Ghent. Application from City of Ghent (online), www.toegent.be/gentmobiliteit/

https://youtu.be/VOC2ul0y2c4

Another way in which people have been utilising open data to make improvements to either their business, or (selflessly volunteering) to improve the society we live in, is by undertaking ‘hackathons’. First of all it must be clarified that although the term hacking is often linked to computer security it was not the source of the term. Hacking actually describes the process of altering a system to provide an outcome which was not originally envisaged by the system’s creator. Subsequently a hackathon is actually where people come together and use a range of datasets to provide new and innovative solutions for a predetermined problem. These events can sometimes last several days, and attendees will often work through the night.

Developing new ideas through Hackathons

Case Study: Flood Hack

A great example of how Hackathons have been used to solve problems, or drive change, is with the Flood Hack hosted by Tech City UK in February 2014.

This all started due to the unusual weather witnessed between December 2013 and May 2014. Substantial rain led to scenes of flooding all over the country. During this period the Environment Agency (EA) was forced to issue 50 severe flood warnings, the highest rated warning that it releases. An emergency meeting at 10 Downing Street prompted a Hackathon event to take place, and within 24 hours over 200 developers had signed up to attend. The EA went to great lengths to make its live flood warnings data open for use during the Hackathon.

The event led to many new ideas being developed which could help those affected by the floods at that time, as well as those who may be affected by future flooding. One application enabled those who had encountered a power cut to find the contact details of the appropriate Distribution Network Operator (DNO). All the application required users to do was enter their current location.

Although this Hackathon was designed to make societal improvement, this doesn’t always have to be the case. Organisations can undertake these events to help develop concepts internally. This example demonstrates the success that can be seen through undertaking Hackathons, and suggests that they can be a great way of promoting ideas.

It is events like hackathons which can help to start the formation of partnerships surrounding data. It has already been discussed that large companies often do not have the agility required to develop internal procedures that are capable of analysing multiple data sets to develop new business opportunities. With this in mind it is becoming an increasingly attractive proposition for large organisations to collaborate with relatively small software developers to help overcome this. This enables large organisations to move what they perceive as a substantial risk, to a smaller developer which is much better suited to this role. Furthermore, by teaming up with a large organisation, it offers small developers a much more reliable revenue stream as well as a large continual quantity of data. This set-up manages to reduce the risk being placed on both parties, and it is why it carries so much potential.

Background

In 2014, the BRE Trust commissioned the study ‘Open & Big Data in Construction’, which was undertaken by BRE in collaboration with Generation for Change (G4C, part of Constructing Excellence). This project set out to engage with the construction industry to discuss, and identify solutions to, the barriers currently slowing the use of open and/or big data within the built environment. In order to achieve this, the study looked to demonstrate the potential benefits that collecting, managing, analysing and even releasing data can have on a range of organisations within the construction sector.

One of the most prominent barriers to data utilisation identified during the study was that those who are relatively new to the area often feel overwhelmed by the technical language used. Subsequently, this report also looks to simplify wording and clearly explain some of the most important issues. The main objective of this study is to raise awareness of the benefits that data utilisation can offer, and simultaneously increase the level of data literacy along the supply chain. A combination of structured interviews and debate events has been undertaken to gather a wide range of opinion, as well as to provide expert knowledge in this area.

Acknowledgements

This project relied on the input and support of a number of people from across the industry:

  • Antonio Pisarno (Marcel Mauer Architects / G4C)
  • Stuart Chalmers (BRE)
  • Professor Tim Stonor (Space Syntax)
  • Stephen Wooldridge (Barratt Homes)
  • Ben Cave (formerly worked with Citadel on the Move, now with the ODI)
  • Tom Brown (Lambeth Council)
  • James Johnston (Open Utility)

References

DECC (online). Where to find help in a power cut. Date accessed: 26/08/2015

Garvin, S. A Future Flood Resilient Built Environment. White Paper. BRE Trust, 2014

Rose, M. (online). Friday Lunchtime Lecture: The Environment Agency’s open data journey. Date accessed: 26/08/2015

Tech City (online). Tech City UK convenes developer talent and tech community to host #floodhack in Shoreditch. Date accessed: 26/08/2015

Whatishacking (online). What is hacking? Date accessed: 26/08/2015

Cabinet Office (online). Local Open Data Champions. Date accessed: 26/08/2015

CIO UK (online). Can open data work for the private sector? Date accessed: 26/08/2015

Citadel on the Move (online). Citadel open data. Date accessed: 26/08/2015

Citadel on the Move (online). Realtime parking availability in Ghent. Date accessed: 26/08/2015

Data.gov.uk (online). About. Date accessed: 26/08/2015

Dell (online). Data Mining Techniques. Date accessed: 26/08/2015

Democrata (online). Democratising Data & Insight. Date accessed: 26/08/2015

Garrett, H., Davidson, M., Nicol, S., Roys, M. and Summers, C. The cost of poor housing in London. FB 65. Bracknell, IHS BRE Press, 2014

Hartree Centre: Science & Technology Facilities Council (online). Revealing the value of open data. Date accessed: 26/08/2015

Open Data Institute (undated). How to make a business case for open data. Date accessed: 26/08/2015
