Data analysis in SQL Server by means of Python

V. Fedko

Description: The capabilities of data analysis tools are considered. Data storage methods adapted to the efficient execution of analytical queries are described, along with the language tools provided by the Microsoft SQL Server component Machine Learning Services (In-Database). Operational databases (OLTP systems) are compared with data warehouses oriented toward data analysis (OLAP systems). Examples of both kinds of system are given, and the system that connects them (the ETL system) is considered. Data analysis tools that, in the simplest cases, are applied to an OLAP cube are described, followed by the language tools for performing data analysis in more complex cases. A comparison of the R and Python languages shows that Python allows complete data processing applications to be built, and that its libraries are almost the same as those available in R. It is shown that, given the great popularity of language-based analysis tools, recent releases of SQL Server included the SQL Server R Services component, which brought new capabilities to SQL Server that circumvent the restriction that all data must be held in memory. The main advantages of the Machine Learning Services component are described, as well as the specifics of its installation. Specific examples demonstrate how calculations and graphical presentation of results can be performed in Python within the SQL Server environment for data analysis.


Keywords: Business intelligence, Data mining, Data Scientist, Data Engineer, SQL Server, Machine Learning Services, operational database, data warehouse, R and Python languages, data analysis, data visualization

Reference:
Fedko, V.V. (2018), “Analiz danykh v SQL Server zasobamy Python” [Data analysis in SQL Server means of Python], Scientific Works of Kharkiv National Air Force University, Vol. 2(56), pp. 99-104. https://doi.org/10.30748/zhups.2018.56.14.