Entry level/Junior Data Scientist or Data Engineer

Innaxis is currently seeking Data Scientists and Data Engineers to join its research and development team based in Madrid, Spain. We are looking for talented and highly motivated individuals who want to pursue a career outside the more mainstream, conventional alternatives. Individuals with a great deal of imagination, problem-solving skills, flexibility and passion are encouraged to apply.

  • As a Data Engineer, you will help the team design and integrate complete solutions for Big Data architectures, from data acquisition and ETL processes through to storage and delivery for analysis, using the latest technologies and solutions for the best possible performance.
  • As a Data Scientist, you will mainly help the team understand, analyse and mine data, but also prepare the data and assess its quality. You will also develop methods for data fusion and anonymization. Ultimately, your goal will be to extract the best knowledge and insights from data, despite technical limitations and while complying with regulatory requirements.

About Innaxis

If not unique, Innaxis is at the very least unconventional: it is a private, independent, non-profit research institute focused on Data Science and its applications, most notably in aviation, air traffic management and mobility, among other areas.

As an independent entity, Innaxis determines its own research agenda and now has a decade of experience in European research programs, with more than 30 successfully executed projects. New projects and initiatives are evaluated continuously, and the institute is open to new opportunities and ideas proposed within the team.

The Innaxis team consists of a highly interdisciplinary group of scientists, developers, engineers and program managers, together with an extensive network of external partners and collaborators, from private companies to universities, public entities and other research institutes.

Skills wanted

Our team works very closely on a daily basis, so broader knowledge means much better coordination. There is therefore a single list of desired skills for both positions. These skills will then be weighted as requirements or “bonus points” according to the candidate’s position of interest, i.e. Data Scientist or Data Engineer.

  • University degree, MSc or PhD in Data Science or Computer Science, or in a related field provided all other requirements are met.
  • No professional experience required, although it may be positively evaluated.
  • Proficiency in a variety of programming languages, for instance Python, Scala, Java, R or C++, and up-to-date knowledge of the newest software libraries and APIs, e.g. TensorFlow, Theano.
  • Experience with the acquisition, preparation, storage and delivery of data, including concepts ranging from ETL to Data Lakes.
  • Knowledge of the most commonly used software stacks such as LAMP, LAPP, LEAP, OpenStack, SMACK or similar.
  • Familiarity with some of the IaaS, PaaS and SaaS platforms currently available, such as Amazon Web Services, Microsoft Azure, Google Cloud and similar.
  • Understanding of the most popular knowledge discovery and data mining problems and algorithms: predictive analytics, classification, MapReduce, deep learning, random forests, support vector machines and the like.
  • Continuous interest in the latest technologies and developments, e.g. blockchain, Terraform.
  • Excellent English communication skills; English is the working language at Innaxis.
  • And of course, a great deal of imagination, problem-solving skills, flexibility and passion.

Benefits

The successful candidate will be offered a position at Innaxis as a Data Scientist or Data Engineer, including a unique set of benefits:

  • Being part of a young, dynamic, highly qualified, collaborative and heterogeneous international team.
  • Great flexibility in many aspects (including working hours, compatibility with other commitments and location) and excellent working conditions.
  • A horizontal hierarchy: all researchers’ opinions matter.
  • A long-term and stable position. Innaxis has been growing steadily since its foundation ten years ago.
  • A fair salary in line with the nature of the institute, adjusted to skills, experience and education and subject to continuous review.
  • Independence: given the non-profit, research-focused nature of Innaxis, the institute is driven by different forces than the private sector, free of commercial and profit interests.
  • The possibility of developing a unique career outside the mainstream of academia, private companies and consulting.
  • No outsourcing whatsoever: all tasks will be performed at the Innaxis offices.
  • An agile working methodology: Innaxis recently implemented JIRA/Scrum, and all research is carried out on a collaborative wiki (Confluence).

Apply

Interested candidates should send their CV, a research interest letter (around 400 words) and any other relevant information supporting their application to recruitment@innaxis.org. You will then be contacted and a personal selection process will begin.

 

Information, time, knowledge

We live in a world that gathers exponentially increasing amounts of information/data from endless sources, yet has only limited time to analyse it.

What is the current speed of “creating” information/data? What about knowledge/wisdom? What is the role of Data Science and Big Data in this context?

Food for thought for your well-deserved summer break! Enjoy, recharge your batteries and get ready for a 2016/2017 year full of cutting-edge research, innovation (and Innaxis blog posts!).

 


Innaxis at EASA-OPTICS conference. Cologne 12-14 April

Developing the future of a safe and growing aviation business, whilst also reassuring the travelling public that it is safe to fly, is a major goal of both EU and national aviation policies. However:

What role do policy makers play?

What are the recent, implemented safety measures?

Who is guiding the safety topics within aviation research?

EASA, the European Commission, the Advisory Council for Aviation Research and Innovation in Europe (ACARE) and the EU’s OPTICS Project organised a three-day event in Cologne (12-14 April) to provide answers to these pressing questions and, furthermore, to define the way forward to ensure continued aviation safety in Europe. The event featured a number of presentations and workshops covering several aviation safety areas.

Two Innaxis team members, David Perez (dp@innaxis.org) and Hector Ureta (hu@innaxis.org), attended the event and took part in several of the workshops, explaining how Data Science and Big Data can boost aviation safety. On the third day of the event, Hector also presented some of the latest data science techniques and tools in safety research, based on the SESAR-COMPASS project.

 

Hector Ureta (Innaxis) presenting the Data Science research done in COMPASS (Cologne 14 April 2016)

 

The presentation, “Data science and data mining techniques to improve aviation safety: features, patterns and precursors”, is available online at this link.

If you’d like further information about data science in aviation, big data or aviation safety research completed by Innaxis, please feel free to contact the Innaxis team (innovation@innaxis.org).

 

More details of the event are available on the EASA and OPTICS websites.


Wh- Questions about Data Science

The five Wh’s of Data Science – What, Why, When, Who and Which.

While preparing for the upcoming October workshop on Data Science, Innaxis has gathered wh- questions and simple answers about the “new reality” of data science. We also provide links to pages where more information about these important questions is available.

What?

The basic answer to the question of what Data Science is could be “a set of fundamental principles that support and guide the principled extraction of information and knowledge from data”. Definitions, especially of new terms, should remain simple despite the urge to make them complicated. Furthermore, the boundaries between the definitions of Big Data, Data Science, Statistics and Data Mining are not clear-cut; these fields share common principles and tools and, importantly, the same aim: the extraction of valuable information.

Why?

What is the reason for extracting information from data? There is a brilliant quote by Jean Baudrillard: “Information can tell us everything. It has all the answers. But they are answers to questions we have not asked, and which doubtless don’t even arise.” In this context, proper data science is [generally] neither basic science nor long-term research; it is considered an extremely valuable resource for the creation of business. Mining large amounts of both structured and unstructured data to identify patterns can directly help an organization control costs, create customer profiles, increase efficiency, recognize new market opportunities and enhance its competitive advantage.

When?

Throughout history, an extensive list of names has been given to a well-known equivalence, information = power, from the censuses of the Middle Ages to the Royal Navy strategies based on statistical analysis. As for the current understanding of Data Science, its name has moved away from being a synonym for Data Analysis in the early 20th century to being associated, from the 1990s onwards, with Knowledge Discovery (KD). One of the very best compilations of data science history and publications over the last 60 years can be found in this Forbes article.

The methods and tools used have changed over time, developing as mathematical, extraction, software and hardware capabilities have increased in recent years. The consequent “sudden” surge in Data Science jobs, which reflects the market’s real interest in the potential benefits of knowledge extraction, is illustrated by the following graph taken from LinkedIn analytics:

Courtesy LinkedIn Corp.

Who?

If you are a lawyer or a doctor, everybody knows more or less your level of university education and the nature of your daily tasks. What, then, is a “Data Scientist”? The paths that lead to a Data Science career are not well defined and are difficult to identify. The so-called “Sexiest Job of the 21st Century” (according to the Harvard Business Review) needs a common definition and even specific university degrees. The data jockeys who have always been employed on Wall Street are no longer alone. Meanwhile, the scope and variety of data now available is a non-stop, growing force, with the result that operational, statistical and even hacking backgrounds are welcome to extract value from it. More information about data scientist careers and the main disciplines can be found in this excellent article from naturejobs.com.

To understand Data Science job titles, we recommend you also have a look at this article by Vincent Granville from DataScienceCentral. It’s a living tongue twister: a data mining activity carried out by a data scientist on data scientist job titles. To sum up, it is pretty similar to the following recipe: take a mixer from the kitchen; add the words “Data”, “Analytics” and “Scientist”; switch it on; include some institutional label such as “director”, “Junior” or “Manager”. An optional topping could be your university degree, such as “engineer” or “mathematician”. There you have one of the possible titles of today’s data scientist.

Which?

Which data is “datascience-able”? As we described in our previous post about Data Science, there is huge potential in almost every imaginable field that can provide data of sufficient quality for analysis. However, even where the data is available, there are challenges, generally connected with data storage and management capabilities. These challenges are covered in detail in the Innaxis blog post “The benefits and challenges of Big Data”. One of the remarkable and exciting things about Data Science is that there is additional knowledge to extract from data sets that, at first sight, are not expected to provide anything beyond the obvious potential of the so-called “direct” data sets. The reality is that it’s hard to know which data sets will add value before testing them with Data Science. When discovered, hidden patterns and unseen correlations add more valuable knowledge for organizations than direct cause-and-effect relationships. They represent being one step ahead, which is crucial in the highly competitive world in which we live.

By Héctor Ureta – Collaborative R&D Aerospace Engineer at Innaxis

 

 


Turning Big Data into Knowledge

In this two-part blog post we first look at the emergence of Big Data and the challenges it brings. In the next post we take a look at how these challenges are being addressed and the benefits this will unlock.

The emerging challenge of Big Data

Over the first years of the third millennium, worldwide digital data experienced huge growth, from scarce to super-abundant. Whether produced by high-tech scientific experiments or simply compiled from the now ubiquitous sources of automatic data collection through ordinary, everyday transactions, this new reality of “Big Data” (or, to be visually precise, “BIG DATA”) has resulted in the need for large-scale management and storage of data that cannot be handled with conventional tools.

Data management tools and hard drive capacity are not increasing fast enough to keep up with this worldwide explosion in digital data. While in economic production we are increasingly asked to “do more with less”, in relation to data we are increasingly asked to “do more with more”.

What is the impact of this new reality, and what are its potential benefits for the world of scientific research? In a world where economies, political freedom, social welfare and cultural growth increasingly depend on our technological capabilities, Big Data management and, most importantly, the knowledge that can be obtained from it have enormous potential to benefit individual organizations.

This topic is covered in two parts: the first includes this introduction and the main sources of Big Data; the second will explain the challenges and benefits of Big Data.

Sources of Big Data

There are two common sources of Big Data. On the one hand, there are scientific experiments and tools, which were the first subject of Big Data-specific study, mostly in the field of physics and involving either macro or micro spatial scales. In the natural sciences, some biology studies, in particular in the field of DNA research, have more recently started to make use of Big Data.

On the other hand, the other significant source is simply “everyday” data: the vast quantity of information that is now collected every day at a million points of citizen interaction, through billions of embedded sensors worldwide.

Prepare for some big numbers:

Physics: Large Hadron Collider (LHC):

The world’s largest and highest-energy particle accelerator and one of the greatest engineering milestones ever achieved, the LHC produces around 25 petabytes of raw data per year, capturing information on over 300 trillion (3×10^14) proton-proton collisions. Managing this information is not easy, even with the world’s largest computing grid (170 computing centres in 36 countries). The extraction of information and knowledge from these particularly huge datasets enabled the recent discovery of the Higgs boson or “God particle”, a discovery that will probably result in the team behind it being awarded the 2013 Nobel Prize in Physics.

Astronomy:

When the Sloan Digital Sky Survey (SDSS) telescope began operating in 2000, it collected more data in one week than had been amassed in the entire history of astronomy. The new Large Synoptic Survey Telescope (LSST), due to commence in 2020, will store in 5 days the same amount of data that SDSS will have collected over the 13 years since its inception. Storing and processing these massive data sets from gigapixel telescopes on the earth’s surface and in space requires very specific tools that have been beyond the current state of the art. Consequently, astronomy, in trying to extract knowledge to create the most accurate “universe map”, is one of the leading protagonists in the field of Big Data.

Everyday data

This is the kind of data gathered by countless automatic recording devices that capture what, how and where we purchase, where we go and more. It’s really remarkable how our lives have changed in the last couple of decades. All of these improvements and the inherent multiplication in consumption and goods, together with the resulting transactions, communications and more, are being captured through hundreds of receptors. In addition, user-generated content such as digital media files, video, photos and blogs is being generated and stored on an unprecedented scale. Our locations (GPS-GLONASS-Galileo), money transactions (credit card, NFC payments etc.), several different forms of communication and even what we think and do in our free time (via social networks) are being collected by different corporate and government bodies.

One of the most accurate studies to date, published in the journal Science, estimated that in 2007 humanity could store around 295 exabytes (1 exabyte = 1,000,000 terabytes) of data. Global data in 2009 was calculated to have reached 800 exabytes, and by the end of 2013 it is forecast to reach more than 3 zettabytes (3×10^21 bytes, or 3,000,000,000,000,000,000,000 bytes). Impressive. Many challenges obviously arise with a roughly 60% yearly increase in the data to be handled, and issues abound over how to process and extract useful information from what is 95% raw data.
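To get a feel for these numbers, the yearly growth implied by the quoted figures can be worked out with a short calculation. The following is only a rough sketch in Python: the function names are our own, and it assumes the three figures are directly comparable, which may not hold since they come from different estimates.

```python
# Figures quoted in the text, in exabytes (1 zettabyte = 1,000 exabytes).
# Assumption: the three estimates measure the same thing, which may not be
# strictly true since they come from different sources.
QUOTED_EXABYTES = {2007: 295, 2009: 800, 2013: 3000}


def annual_growth(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after `years` of compound growth at a constant `rate`."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    checkpoints = sorted(QUOTED_EXABYTES)
    # Growth rate implied between each pair of consecutive quoted years.
    for first, last in zip(checkpoints, checkpoints[1:]):
        rate = annual_growth(
            QUOTED_EXABYTES[first], QUOTED_EXABYTES[last], last - first
        )
        print(f"{first}-{last}: about {rate:.0%} per year")

    # What a constant 60% yearly increase from 2009 would give by 2013.
    projected = project(QUOTED_EXABYTES[2009], 0.60, 2013 - 2009)
    print(f"2013 at a constant 60% per year: about {projected:,.0f} exabytes")
```

With the figures as quoted, this gives roughly 65% per year between 2007 and 2009 and roughly 39% per year between 2009 and 2013, so the 60% figure is best read as an order of magnitude rather than a precise rate.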

In our next post we’ll look at how these challenges are being addressed.
