International Science Index
International Journal of Computer and Systems Engineering
A Data-Driven Support Vector Regression -Based Leaflet Modeling Approach for Personalized Aortic Valve Prostheses
While the aortic valve geometry is highly patient-specific, state-of-the-art prostheses are not aiming at reproducing this individual geometry. One challenge in manufacturing personalized prostheses is the mapping from the curved 3D shape extracted from imaging modalities to the planar 2D leaflet shape. To address this problem, a database was set up to evaluate valve leaflet shape models. First, 3D ultrasound images of ex-vivo porcine valves were acquired under physiologically realistic pressure to extract geometric key parameters describing the individual geometry. In a second step, the valves’ leaflets were cut out, spread on an illuminated plate and photographed in this state. From these images, the leaflet shape was extracted using edge detection. This database allows the derivation of a data-driven leaflet model utilizing machine learning, i.e. non-linear Support Vector Regression (SVR). Additionally, a state-of-the-art leaflet shape model was evaluated on the dataset. The data-driven approach provided an acceptable leaflet shape estimation and clearly outperformed the state-of-the-art model. This presents an important step towards personalized aortic valve prostheses.
Generation of Knowlege with Self-Learning Methods for Ophthalmic Data
Problem and Purpose: Intelligent systems are available and helpful to support the human being decision process, especially when complex surgical eye interventions are necessary and must be performed. Normally, such a decision support system consists of a knowledge-based module, which is responsible for the real assistance power, given by an explanation and logical reasoning processes. The interview based acquisition and generation of the complex knowledge itself is very crucial, because there are different correlations between the complex parameters. So, in this project (semi)automated self-learning methods are researched and developed for an enhancement of the quality of such a decision support system. Methods: For ophthalmic data sets of real patients in a hospital, advanced data mining procedures seem to be very helpful. Especially subgroup analysis methods are developed, extended and used to analyze and find out the correlations and conditional dependencies between the structured patient data. After finding causal dependencies, a ranking must be performed for the generation of rule-based representations. For this, anonymous patient data are transformed into a special machine language format. The imported data are used as input for algorithms of conditioned probability methods to calculate the parameter distributions concerning a special given goal parameter. Results: In the field of knowledge discovery advanced methods and applications could be performed to produce operation and patient related correlations. So, new knowledge was generated by finding causal relations between the operational equipment, the medical instances and patient specific history by a dependency ranking process. After transformation in association rules logically based representations were available for the clinical experts to evaluate the new knowledge. The structured data sets take account of about 80 parameters as special characteristic features per patient. For different extended patient groups (100, 300, 500), as well one target value as well multi-target values were set for the subgroup analysis. So the newly generated hypotheses could be interpreted regarding the dependency or independency of patient number. Conclusions: The aim and the advantage of such a semi-automatically self-learning process are the extensions of the knowledge base by finding new parameter correlations. The discovered knowledge is transformed into association rules and serves as rule-based representation of the knowledge in the knowledge base. Even more, than one goal parameter of interest can be considered by the semi-automated learning process. With ranking procedures, the most strong premises and also conjunctive associated conditions can be found to conclude the interested goal parameter. So the knowledge, hidden in structured tables or lists can be extracted as rule-based representation. This is a real assistance power for the communication with the clinical experts.
Gradient Boosted Trees on Spark Platform for Supervised Learning in Health Care Big Data
Health care is one of the prominent industries that generate voluminous data thereby finding the need of machine learning techniques with big data solutions for efficient processing and prediction. Missing data, incomplete data, real time streaming data, sensitive data, privacy, heterogeneity are few of the common challenges to be addressed for efficient processing and mining of health care data. In comparison with other applications, accuracy and fast processing are of higher importance for health care applications as they are related to the human life directly. Though there are many machine learning techniques and big data solutions used for efficient processing and prediction in health care data, different techniques and different frameworks are proved to be effective for different applications largely depending on the characteristics of the datasets.
In this paper, we present a framework that uses ensemble machine learning technique gradient boosted trees for data classification in health care big data. The framework is built on Spark platform which is fast in comparison with other traditional frameworks. Unlike other works that focus on a single technique, our work presents a comparison of six different machine learning techniques along with gradient boosted trees on datasets of different characteristics. Five benchmark health care datasets are considered for experimentation, and the results of different machine learning techniques are discussed in comparison with gradient boosted trees. The metric chosen for comparison is misclassification error rate and the run time of the algorithms. The goal of this paper is to i) Compare the performance of gradient boosted trees with other machine learning techniques in Spark platform specifically for health care big data and ii) Discuss the results from the experiments conducted on datasets of different characteristics thereby drawing inference and conclusion. The experimental results show that the accuracy is largely dependent on the characteristics of the datasets for other machine learning techniques whereas gradient boosting trees yields reasonably stable results in terms of accuracy without largely depending on the dataset characteristics.
Water End-Use Classification with Contemporaneous Water-Energy Data and Deep Learning Network
Water-related energy is energy use which is directly or indirectly influenced by changes to water use. Informatics applying a range of mathematical, statistical and rule-based approaches can be used to reveal important information on demand from the available data provided at second, minute or hourly intervals. This study aims to combine these two concepts to improve the current water end use disaggregation problem through applying a wide range of most advanced pattern recognition techniques to analyse the concurrent high-resolution water-energy consumption data. The obtained results have shown that recognition accuracies of all end-uses have significantly increased, especially for mechanised categories, including clothes washer, dishwasher and evaporative air cooler where over 95% of events were correctly classified.
Cyber Supply Chain Resilient: Enhancing Security through Leadership to Protect National Security
Cyber criminals are constantly on the lookout for new opportunities to exploit organisation and cause destruction. This could lead to significant cause of economic loss for organisations in the form of destruction in finances, reputation and even the overall survival of the organization. Additionally, this leads to serious consequences on national security. The threat of possible cyber attacks places further pressure on organisations to ensure they are secure, at a time where international scale cyber attacks have occurred in a range of sectors. Stakeholders are wanting confidence that their data is protected. This is only achievable if a business fosters a resilient supply chain strategy which is implemented throughout its supply chain by having a strong cyber leadership culture. This paper will discuss the essential role and need for organisations to adopt a cyber leadership culture and direction to learn about own internal processes to ensure mitigating systemic vulnerability of its supply chains. This paper outlines that to protect national security there is an urgent need for cyber awareness culture change. This is required in all organisations, regardless of their sector or size, to implementation throughout the whole supplier chain to support and protect economic prosperity to make the UK more resilient to cyber-attacks. Through businesses understanding the supply chain and risk management cycle of their own operates has to be the starting point to ensure effective cyber migration strategies.
Generic Early Warning Signals for Program Student Withdrawals: A Complexity Perspective Based on Critical Transitions and Fractals
Complex systems exhibit universal characteristics as they near a tipping point. Among them are common generic early warning signals which precede critical transitions. These signals include: critical slowing down in which the rate of recovery from perturbations decreases over time; an increase in the variance of the state variable; an increase in the skewness of the state variable; an increase in the autocorrelations of the state variable; flickering between different states; and an increase in spatial correlations over time. The presence of the signals has management implications, as the identification of the signals near the tipping point could allow management to identify intervention points. Despite the applications of the generic early warning signals in various scientific fields, such as fisheries, ecology and finance, a review of literature did not identify any applications that address the program student withdrawal problem at the undergraduate distance universities. This area could benefit from the application of generic early warning signals as the program withdrawal rate amongst distance students is higher than the program withdrawal rate at face-to-face conventional universities. This research specifically assessed the generic early warning signals through an intensive case study of undergraduate program student withdrawal at a Canadian distance university. The university is non-cohort based due to its system of continuous course enrollment where students can enroll in a course at the beginning of every month. The assessment of the signals was achieved through the comparison of the incidences of generic early warning signals among students who withdrew or simply became inactive in their undergraduate program of study, the true positives, to the incidences of the generic early warning signals among graduates, the false positives. This was achieved through significance testing. Research findings showed support for the signal pertaining to the rise in flickering which is represented in the increase in the student’s non-pass rates prior to withdrawing from a program; moderate support for the signals of critical slowing down as reflected in the increase in the time a student spends in a course; and moderate support for the signals on increase in autocorrelation and increase in variance in the grade variable. The findings did not support the signal on the increase in skewness of the grade variable. The research also proposes a new signal based on the fractal-like characteristic of student behavior. The research also sought to extend knowledge by investigating whether the emergence of a program withdrawal status is self-similar or fractal-like at multiple levels of observation, specifically the program level and the course level. In other words, whether the act of withdrawal at the program level is also present at the course level. The findings moderately supported self-similarity as a potential signal. Overall, the assessment of the signals suggests that the signals, with the exception with the increase of skewness, could be utilized as a predictive management tool and potentially add one more tool, the fractal-like characteristic of withdrawal, as an additional signal in addressing the student program withdrawal problem.
A Graph Based Stemmer for Arabic Extrinsic Plagiarism Detection
Arabic is one of the most challenging languages in the field of Natural Language Processing (NLP). These challenges are reflected in the difficulties brought by its rich vocabulary and its complex morphology. Stemming as a technique of Arabic NLP is increasingly becoming a significant research domain. Stemming approaches are several: the first one is based on manually made dictionaries, the second one on the morphological analysis of the language, and the third one on statistical studies. In this study, a new graph based-approach for stemming in Arabic documents was proposed. Moreover, an evaluation of this stemmer impact on extrinsic plagiarism detection was elaborated. In this approach, a word is represented by a directed weighted graph having a set of connected components. Each of these components has a specific representation. Then, a stem is selected by comparing the word’s representation with a database of 450 stems. This stemmer showed efficiency by improving the detection process of extrinsic plagiarism which is proved by the results obtained.
Assembly Training: An Augmented Reality Approach Using Design Science Research
Augmented Reality (AR) is a strong growing research topic. This innovative technology is interesting for several training domains like education, medicine, military, sports and industrial use cases like assembly and maintenance tasks. AR can help to improve the efficiency, quality and transfer of training tasks. Due to these reasons, AR becomes more interesting for big companies and researchers because the industrial domain is still an unexplored field. This paper presents the research proposal of a PhD thesis which is done in cooperation with the BMW Group, aiming to explore head-mounted display (HMD) based training in industrial environments. We give a short introduction, describing the motivation, the underlying problems as well as the five formulated research questions we want to clarify along this thesis. We give a brief overview of the current assembly training in industrial environments and present some AR-based training approaches, including their research deficits. We use the Design Science Research (DSR) framework for this thesis and describe how we want to realize the seven guidelines, mandatory from the DSR. Furthermore, we describe each methodology which we use within that framework and present our approach in a comprehensive figure, representing the entire thesis.
Foslip Loaded and CEA-Affimer Functionalised Silica Nanoparticles for Fluorescent Imaging of Colorectal Cancer Cells
Introduction: There is a need for real-time imaging of colorectal cancer (CRC) to allow tailored surgery to the disease stage. Fluorescence guided laparoscopic imaging of primary colorectal cancer and the draining lymphatics would potentially bring stratified surgery into clinical practice and realign future CRC management to the needs of patients. Fluorescent nanoparticles can offer many advantages in terms of intra-operative imaging and therapy (theranostic) in comparison with traditional soluble reagents. Nanoparticles can be functionalised with diverse reagents and then targeted to the correct tissue using an antibody or Affimer (artificial binding protein). We aimed to develop and test fluorescent silica nanoparticles and targeted against CRC using an anti-carcinoembryonic antigen (CEA) Affimer (Aff). Methods: Anti-CEA and control Myoglobin Affimer binders were subcloned into the expressing vector pET11 followed by transformation into BL21 Star™ (DE3) E.coli. The expression of Affimer binders was induced using 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Cells were harvested, lysed and purified using nickle chelating affinity chromatography. The photosensitiser Foslip (soluble analogue of 5,10,15,20-Tetra(m-hydroxyphenyl) chlorin) was incorporated into the core of silica nanoparticles using water-in-oil microemulsion technique. Anti-CEA or control Affs were conjugated to silica nanoparticles surface using sulfosuccinimidyl-4-(N-maleimidomethyl) cyclohexane-1-carboxylate (sulfo SMCC) chemical linker. Binding of CEA-Aff or control nanoparticles to colorectal cancer cells (LoVo, LS174T and HC116) was quantified in vitro using confocal microscopy. Results: The molecular weights of the obtained band of Affimers were ~12.5KDa while the diameter of functionalised silica nanoparticles was ~80nm. CEA-Affimer targeted nanoparticles demonstrated 9.4, 5.8 and 2.5 fold greater fluorescence than control in, LoVo, LS174T and HCT116 cells respectively (p < 0.002) for the single slice analysis. A similar pattern of successful CEA-targeted fluorescence was observed in the maximum image projection analysis, with CEA-targeted nanoparticles demonstrating 4.1, 2.9 and 2.4 fold greater fluorescence than control particles in LoVo, LS174T, and HCT116 cells respectively (p < 0.0002). There was no significant difference in fluorescence for CEA-Affimer vs. CEA-Antibody targeted nanoparticles. Conclusion: We are the first to demonstrate that Foslip-doped silica nanoparticles conjugated to anti-CEA Affimers via SMCC allowed tumour cell-specific fluorescent targeting in vitro, and had shown sufficient promise to justify testing in an animal model of colorectal cancer. CEA-Affimer appears to be a suitable targeting molecule to replace CEA-Antibody. Targeted silica nanoparticles loaded with Foslip photosensitiser is now being optimised to drive photodynamic killing, via reactive oxygen generation.
Vegetation Index-Deduced Crop Coefficient of Wheat (Triticum aestivum) Using Remote Sensing: Case Study on Four Basins of Golestan Province, Iran
Crop coefficient (Kc) is an important factor contributing to estimation of evapotranspiration, and is also used to determine the irrigation schedule. This study investigated and determined the monthly Kc of winter wheat (Triticum aestivum L.) using five vegetation indices (VIs): Normalized Difference Vegetation Index (NDVI), Difference Vegetation Index (DVI), Soil Adjusted Vegetation Index (SAVI), Infrared Percentage Vegetation Index (IPVI), and Ratio Vegetation Index (RVI) of four basins in Golestan province, Iran. 14 Landsat-8 images according to crop growth stage were used to estimate monthly Kc of wheat. VIs were calculated based on infrared and near infrared bands of Landsat 8 images using Geographical Information System (GIS) software. The best VIs were chosen after establishing a regression relationship among these VIs with FAO Kc and Kc that was modified for the study area by the previous research based on R² and Root Mean Square Error (RMSE). The result showed that local modified SAVI with R²= 0.767 and RMSE= 0.174 was the best index to produce monthly wheat Kc maps.
Variation of Phytoplankton Biomass in the East China Sea Based on MODIS Data
The East China Sea is one of four main seas in China, where there are many fishery resources. Some important fishing grounds, such as Zhousan fishing ground important to society. But the eco-environment is destroyed seriously due to the rapid developing of industry and economy these years. In this paper, about twenty-year satellite data from MODIS and the statistical information of marine environment from the China marine environmental quality bulletin were applied to do the research. The chlorophyll-a concentration data from MODIS were dealt with in the East China Sea and then used to analyze the features and variations of plankton biomass in recent years. The statistics method was used to obtain their spatial and temporal features. The plankton biomass in the Yangtze River estuary and the Taizhou region were highest. The high phytoplankton biomass usually appeared between the 88th day to the 240th day (end-March - August). In the peak time of phytoplankton blooms, the Taizhou islands was the earliest, and the South China Sea was the latest. The intensity and period of phytoplankton blooms were connected with the global climate change. This work give us confidence to use satellite data to do more researches about the China Sea, and it also provides some help for us to know about the eco-environmental variation of the East China Sea and regional effect from global climate change.
Landcover Mapping Using Lidar Data and Aerial Image and Soil Fertility Degradation Assessment for Rice Production Area in Quezon, Nueva Ecija, Philippines
Land-cover maps were important for many scientific, ecological and land management purposes and during the last decades, rapid decrease of soil fertility was observed to be due to land use practices such as rice cultivation. High-precision land-cover maps are not yet available in the area which is important in an economy management. To assure accurate mapping of land cover to provide information, remote sensing is a very suitable tool to carry out this task and automatic land use and cover detection. The study did not only provide high precision land cover maps but it also provides estimates of rice production area that had undergone chemical degradation due to fertility decline. Land-cover were delineated and classified into pre-defined classes to achieve proper detection features. After generation of Land-cover map, of high intensity of rice cultivation, soil fertility degradation assessment in rice production area due to fertility decline was created to assess the impact of soils used in agricultural production. Using Simple spatial analysis functions and ArcGIS, the Land-cover map of Municipality of Quezon in Nueva Ecija, Philippines was overlaid to the fertility decline maps from Land Degradation Assessment Philippines- Bureau of Soils and Water Management (LADA-Philippines-BSWM) to determine the area of rice crops that were most likely where nitrogen, phosphorus, zinc and sulfur deficiencies were induced by high dosage of urea and imbalance N:P fertilization. The result found out that 80.00 % of fallow and 99.81% of rice production area has high soil fertility decline.
Extraction of Forest Plantation Resources in Selected Forest of San Manuel, Pangasinan, Philippines Using LiDAR Data for Forest Status Assessment
Forest inventories are essential to assess the composition, structure and distribution of forest vegetation that can be used as baseline information for management decisions. Classical forest inventory is labor intensive and time-consuming and sometimes even dangerous. The use of Light Detection and Ranging (LiDAR) in forest inventory would improve and overcome these restrictions. This study was conducted to determine the possibility of using LiDAR derived data in extracting high accuracy forest biophysical parameters and as a non-destructive method for forest status analysis of San Manual, Pangasinan. Forest resources extraction was carried out using LAS tools, GIS, Envi and .bat scripts with the available LiDAR data. The process includes the generation of derivatives such as Digital Terrain Model (DTM), Canopy Height Model (CHM) and Canopy Cover Model (CCM) in .bat scripts followed by the generation of 17 composite bands to be used in the extraction of forest classification covers using ENVI 4.8 and GIS software. The Diameter in Breast Height (DBH), Above Ground Biomass (AGB) and Carbon Stock (CS) were estimated for each classified forest cover and Tree Count Extraction was carried out using GIS. Subsequently, field validation was conducted for accuracy assessment. Results showed that the forest of San Manuel has 73% Forest Cover, which is relatively much higher as compared to the 10% canopy cover requirement. On the extracted canopy height, 80% of the tree’s height ranges from 12 m to 17 m. CS of the three forest covers based on the AGB were: 20819.59 kg/20x20 m for closed broadleaf, 8609.82 kg/20x20 m for broadleaf plantation and 15545.57 kg/20x20m for open broadleaf. Average tree counts for the tree forest plantation was 413 trees/ha. As such, the forest of San Manuel has high percent forest cover and high CS.
A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data
The treatment of incomplete data is an important step in the data pre-processing. Missing values creates a noisy environment in all applications and it is an unavoidable problem in big data management and analysis. Numerous techniques likes discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques and hot deck imputation have been introduced by researchers for handling missing data. Among these, imputation techniques plays a positive role in filling missing values when it is necessary to use all records in the data and not to discard records with missing values. In this paper we propose a novel artificial neural network based clustering algorithm, Adaptive Resonance Theory-2(ART2) for imputation of missing values in mixed attribute data sets. The process of ART2 can recognize learned models fast and be adapted to new objects rapidly. It carries out model-based clustering by using competitive learning and self-steady mechanism in dynamic environment without supervision. The proposed approach not only imputes the missing values but also provides information about handling the outliers.
Evolving Knowledge Extraction from Online Resources
In this paper, we present an evolving knowledge
extraction system named AKEOS (Automatic Knowledge Extraction
from Online Sources). AKEOS consists of two modules, including
a one-time learning module and an evolving learning module.
The one-time learning module takes in user input query, and
automatically harvests knowledge from online unstructured resources
in an unsupervised way. The output of the one-time learning is a
structured vector representing the harvested knowledge. The evolving
learning module automatically schedules and performs repeated
one-time learning to extract the newest information and track the
development of an event. In addition, the evolving learning module
summarizes the knowledge learned at different time points to produce
a final knowledge vector about the event. With the evolving learning,
we are able to visualize the key information of the event, discover
the trends, and track the development of an event.
Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features
Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.
Taxonomic Classification for Living Organisms Using Convolutional Neural Networks
Taxonomic classification has a wide-range of applications such as finding out more about the evolutionary history of organisms that can be done by making a comparison between species living now and species that lived in the past. This comparison can be made using different kinds of extracted species’ data which include DNA sequences. Compared to the estimated number of the organisms that nature harbours, humanity does not have a thorough comprehension of which specific species they all belong to, in spite of the significant development of science and scientific knowledge over many years. One of the methods that can be applied to extract information out of the study of organisms in this regard is to use the DNA sequence of a living organism as a marker, thus making it available to classify it into a taxonomy. The classification of living organisms can be done in many machine learning techniques including Neural Networks (NNs). In this study, DNA sequences classification is performed using Convolutional Neural Networks (CNNs) which is a special type of NNs.
Localization of Geospatial Events and Hoax Prediction in the UFO Database
Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at National UFO Report Center (NUFORC). Some of these reports are hoax, and amongst those that seem legitimate, there isn’t currently an established method to confirm that they indeed are events related to flying objects from aliens in outer space. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We present a scheme consisting of feature extraction by filtering related datasets over a time-band of 24 hrs and use multi-dimensional textual summaries along with geospatial information to determine best clusters of UFO activity. Later, we look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlate to a missile test conducted in the United States.
Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Sentiment analysis and opinion mining have become
emerging topics of research in recent years but most of the work
is focused on data in the English language. A comprehensive
research and analysis are essential which considers multiple
languages, machine translation techniques, and different classifiers.
This paper presents, a comparative analysis of different approaches
for multilingual sentiment analysis. These approaches are divided
into two parts: one using classification of text without language
translation and second using the translation of testing data to a
target language, such as English, before classification. The presented
research and results are useful for understanding whether machine
translation should be used for multilingual sentiment analysis or
building language specific sentiment classification systems is a better
approach. The effects of language translation techniques, features,
and accuracy of various classifiers for multilingual sentiment analysis
is also discussed in this study.
Modern Detection and Description Methods for Natural Plants Recognition
Green planet is one of the Earth’s names which is known as a terrestrial planet and also can be named the fifth largest planet of the solar system as another scientific interpretation. Plants do not have a constant and steady distribution all around the world, and even plant species’ variations are not the same in one specific region. Presence of plants is not only limited to one field like botany; they exist in different fields such as literature and mythology and they hold useful and inestimable historical records. No one can imagine the world without oxygen which is produced mostly by plants. Their influences become more manifest since no other live species can exist on earth without plants as they form the basic food staples too. Regulation of water cycle and oxygen production are the other roles of plants. The roles affect environment and climate. Plants are the main components of agricultural activities. Many countries benefit from these activities. Therefore, plants have impacts on political and economic situations and future of countries. Due to importance of plants and their roles, study of plants is essential in various fields. Consideration of their different applications leads to focus on details of them too. Automatic recognition of plants is a novel field to contribute other researches and future of studies. Moreover, plants can survive their life in different places and regions by means of adaptations. Therefore, adaptations are their special factors to help them in hard life situations. Weather condition is one of the parameters which affect plants life and their existence in one area. Recognition of plants in different weather conditions is a new window of research in the field. Only natural images are usable to consider weather conditions as new factors. Thus, it will be a generalized and useful system. In order to have a general system, distance from the camera to plants is considered as another factor. The other considered factor is change of light intensity in environment as it changes during the day. Adding these factors leads to a huge challenge to invent an accurate and secure system. Development of an efficient plant recognition system is essential and effective. One important component of plant is leaf which can be used to implement automatic systems for plant recognition without any human interface and interaction. Due to the nature of used images, characteristic investigation of plants is done. Leaves of plants are the first characteristics to select as trusty parts. Four different plant species are specified for the goal to classify them with an accurate system. The current paper is devoted to principal directions of the proposed methods and implemented system, image dataset, and results. The procedure of algorithm and classification is explained in details. First steps, feature detection and description of visual information, are outperformed by using Scale invariant feature transform (SIFT), HARRIS-SIFT, and FAST-SIFT methods. The accuracy of the implemented methods is computed. In addition to comparison, robustness and efficiency of results in different conditions are investigated and explained.
Learning Dynamic Representations of Nodes in Temporally Variant Graphs
In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been drawn on devising the most informative features, and this area of research has gained even more focus with spread of (social) network analytics. The call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on churn prediction task in telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.
Road Accidents Bigdata Mining and Visualization Using Support Vector Machines
Useful information has been extracted from the
road accident data in United Kingdom (UK), using data analytics
method, for avoiding possible accidents in rural and urban areas.
This analysis make use of several methodologies such as data
integration, support vector machines (SVM), correlation machines
and multinomial goodness. The entire datasets have been imported
from the traffic department of UK with due permission. The
information extracted from these huge datasets forms a basis for
several predictions, which in turn avoid unnecessary memory
lapses. Since data is expected to grow continuously over a period
of time, this work primarily proposes a new framework model
which can be trained and adapt itself to new data and make
accurate predictions. This work also throws some light on use of
SVM’s methodology for text classifiers from the obtained traffic
data. Finally, it emphasizes the uniqueness and adaptability of
SVMs methodology appropriate for this kind of research work.
A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database
The increasing amount of collected data has limited the performance of the current analyzing algorithms. Thus, developing new cost-effective algorithms in terms of complexity, scalability, and accuracy raised significant interests. In this paper, a modified effective k-means based algorithm is developed and experimented. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block distance and a new stop criterion to guarantee the convergence. Conducted experiments on a real data set show its high performance when compared with the original k-means version.
Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency
Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.
Monitoring the Change of Padma River Bank at Faridpur, Bangladesh Using Remote Sensing Approach
Bangladesh is often called as a motherland of rivers. It contains about 700 rivers among all these the Padma River is one of the largest rivers of Bangladesh. The change of river bank and erosion has become a common environmental natural hazard in Bangladesh. The river banks are under intense pressure from natural processes such as erosion and accretion as well as anthropogenic processes such as urban growth and pollution. The Padma River is flowing along ten districts of Bangladesh among all these Faridpur district is most vulnerable to river bank erosion. The severity of the river erosion is so high that each year a thousand of populations become homeless and lose their agricultural lands. Though the Faridpur district is most vulnerable to river bank erosion no specific research has been conducted to identify the changing pattern of river bank along this district. The outcome of the research may serve as guidance to prepare river bank monitoring program and management. This research has utilized integrated techniques of remote sensing and geographic information system to monitor the changes from 1995 to 2015 at Faridpur district. To discriminate the land water interface Modified Normalized Difference Water Index (MNDWI) algorithm is applied and on screen digitization approach is used over MNDWI images of 1995, 2002 and 2015 for river bank line extraction. The extent of changes in the river bank along Faridpur district is estimated through overlaying the digitized maps of all three years. The river bank lines are highlighted to infer the erosion and accretion and the changes are calculated. The result shows that the middle of the river is gaining land through sedimentation and the both side river bank is shifting causing severe erosion that consequently resulting the loss of farmland and homestead. Over the study period from 1995 to 2015 it witnessed huge erosion and accretion that played an active role in the changes of the river bank.
Remote Sensing of Urban Land Cover Change: Trends, Driving Forces, and Indicators
This study was conducted in the Kansas City metropolitan area of the United States, which has experienced significant urban sprawling in recent decades. The remote sensing of land cover changes in this area spanned over four decades from 1972 through 2010. The project was implemented in two stages: the first stage focused on detection of long-term trends of urban land cover change, while the second one examined how to detect the coupled effects of human impact and climate change on urban landscapes. For the first-stage study, six Landsat images were used with a time interval of about five years for the period from 1972 through 2001. Four major land cover types, built-up land, forestland, non-forest vegetation land, and surface water, were mapped using supervised image classification techniques. The study found that over the three decades the built-up lands in the study area were more than doubled, which was mainly at the expense of non-forest vegetation lands. Surprisingly and interestingly, the area also saw a significant gain in surface water coverage. This observation raised questions: How have human activities and precipitation variation jointly impacted surface water cover during recent decades? How can we detect such coupled impacts through remote sensing analysis? These questions led to the second stage of the study, in which we designed and developed approaches to detecting fine-scale surface waters and analyzing coupled effects of human impact and precipitation variation on the waters. To effectively detect urban landscape changes that might be jointly shaped by precipitation variation, our study proposed “urban wetscapes” (loosely-defined urban wetlands) as a new indicator for remote sensing detection. The study examined whether urban wetscape dynamics was a sensitive indicator of the coupled effects of the two driving forces. To better detect this indicator, a rule-based classification algorithm was developed to identify fine-scale, hidden wetlands that could not be appropriately detected based on their spectral differentiability by a traditional image classification. Three SPOT images for years 1992, 2008, and 2010, respectively were classified with this technique to generate the four types of land cover as described above. The spatial analyses of remotely-sensed wetscape changes were implemented at the scales of metropolitan, watershed, and sub-watershed, as well as based on the size of surface water bodies in order to accurately reveal urban wetscape change trends in relation to the driving forces. The study identified that urban wetscape dynamics varied in trend and magnitude from the metropolitan, watersheds, to sub-watersheds in response to human impacts at different scales. The study also found that increased precipitation in the region in the past decades swelled larger wetlands in particular while generally smaller wetlands decreased mainly due to human development activities. These results confirm that wetscape dynamics can effectively reveal the coupled effects of human impact and climate change on urban landscapes. As such, remote sensing of this indicator provides new insights into the relationships between urban land cover changes and driving forces.
Evaluate the Possibility of Using ArcGIS Basemaps as GCP for Large Scale Maps
Awareness of the importance large-scale maps for development of a country is growing in all walks of life, especially for governments in Indonesia. Various parties, especially local governments throughout Indonesia demanded for immediate availability the large-scale maps of 1:5000 for regional development. But in fact, the large-scale maps of 1:5000 is only available less than 5% of the entire territory of Indonesia. Unavailability precise GCP at the entire territory of Indonesia is one of causes of slow availability the large scale maps of 1:5000. This research was conducted to find an alternative solution to this problem. This study was conducted to assess the accuracy of ArcGIS base maps coordinate when it shall be used as GCP for creating a map scale of 1:5000. The study was conducted by comparing the GCP coordinate from Field survey using GPS Geodetic than the coordinate from ArcGIS basemaps in various locations in Indonesia. Some areas are used as a study area are Lombok Island, Kupang City, Surabaya City and Kediri District. The differences value of the coordinates serve as the basis for assessing the accuracy of ArcGIS basemaps coordinates. The results of the study at various study area show the variation of the amount of the coordinates value given. Differences coordinate value in the range of millimeters (mm) to meters (m) in the entire study area. This is shown the inconsistency quality of ArcGIS base maps coordinates. This inconsistency shows that the coordinate value from ArcGIS Basemaps is careless. The Careless coordinate from ArcGIS Basemaps indicates that its cannot be used as GCP for large-scale mapping on the entire territory of Indonesia.
Spatio-Temporal Variation of Suspended Sediment Concentration in the near Shore Waters, Southern Karnataka, India
Suspended Sediment Concentration (SSC) was estimated for the period of four months (November, 2013 to February 2014) using Oceansat-2 (Ocean Colour Monitor) satellite images to understand the coastal dynamics and regional sediment transport, especially distribution and budgeting in coastal waters. The coastal zone undergoes continuous changes due to natural processes and anthropogenic activities. The importance of the coastal zone, with respect to safety, ecology, economy and recreation, demands a management strategy in which each of these aspects is taken into account. Monitoring and understanding the sediment dynamics and suspended sediment transport is an important issue for coastal engineering related activities. A study of the transport mechanism of suspended sediments in the near shore environment is essential not only to safeguard marine installations or navigational channels, but also for the coastal structure design, environmental protection and disaster reduction. Such studies also help in assessment of pollutants and other biological activities in the region. An accurate description of the sediment transport, caused by waves and tidal or wave-induced currents, is of great importance in predicting coastal morphological changes. Satellite-derived SSC data have been found to be useful for Indian coasts because of their high spatial (360 m), spectral and temporal resolutions. The present paper outlines the applications of state‐of‐the‐art operational Indian Remote Sensing satellite, Oceansat-2 to study the dynamics of sediment transport.
Assessing the Spatial Distribution of Urban Parks Using Remote Sensing and Geographic Information Systems Techniques
Urban parks and open spaces play a significant role in improving physical and mental health of the citizens, strengthen the societies and make the cities more attractive places to live and work. As the world’s cities continue to grow, continuing to value green space in cities is vital but is also a challenge, particularly in developing countries where there is pressure for space, resources, and development. Offering equal opportunity of accessibility to parks is one of the important issues of park distribution. The distribution of parks should allow all inhabitants to have close proximity to their residence. Remote sensing and Geographic information systems (GIS) can provide decision makers with enormous opportunities to improve the planning and management of Park facilities. This study exhibits the capability of GIS and RS techniques to provide baseline knowledge about the distribution of parks, level of accessibility and to help in identification of potential areas for such facilities. For this purpose Landsat OLI imagery for year 2016 was acquired from USGS Earth Explorer. Preprocessing models were applied using Erdas Imagine 2014v for the atmospheric correction and NDVI model was developed and applied to quantify the land use/land cover classes including built up, barren land, water, and vegetation. The parks amongst total public green spaces were selected based on their signature in remote sensing image and distribution. Percentages of total green and parks green were calculated for each town of Lahore City and results were then synchronized with the recommended standards. ANGSt model was applied to calculate the accessibility from parks. Service area analysis was performed using Network Analyst tool. Serviceability of these parks has been evaluated by employing statistical indices like service area, service population and park area per capita. Findings of the study may contribute in helping the town planners for understanding the distribution of parks, demands for new parks and potential areas which are deprived of parks. The purpose of present study is to provide necessary information to planners, policy makers and scientific researchers in the process of decision making for the management and improvement of urban parks.
The Enhancement of Target Localization Using Ship-Borne Electro-Optical Stabilized Platform
Electro-optical (EO) stabilized platforms have been widely used for surveillance and reconnaissance on various types of vehicles, from surface ships to unmanned air vehicles (UAVs). EO stabilized platforms usually consist of an assembly of structure, bearings, and motors called gimbals in which a gyroscope is installed. EO elements such as a CCD camera and IR camera, are mounted to a gimbal, which has a range of motion in elevation and azimuth and can designate and track a target. In addition, a laser range finder (LRF) can be added to the gimbal in order to acquire the precise slant range from the platform to the target. Recently, a versatile functionality of target localization is needed in order to cooperate with the weapon systems that are mounted on the same platform. The target information, such as its location or velocity, needed to be more accurate. The accuracy of the target information depends on diverse component errors and alignment errors of each component. Specially, the type of moving platform can affect the accuracy of the target information. In the case of flying platforms, or UAVs, the target location error can be increased with altitude so it is important to measure altitude as precisely as possible. In the case of surface ships, target location error can be increased with obliqueness of the elevation angle of the gimbal since the altitude of the EO stabilized platform is supposed to be relatively low. The farther the slant ranges from the surface ship to the target, the more extreme the obliqueness of the elevation angle. This can hamper the precise acquisition of the target information. So far, there have been many studies on EO stabilized platforms of flying vehicles. However, few researchers have focused on ship-borne EO stabilized platforms of the surface ship. In this paper, we deal with a target localization method when an EO stabilized platform is located on the mast of a surface ship. Especially, we need to overcome the limitation caused by the obliqueness of the elevation angle of the gimbal. We introduce a well-known approach for target localization using Unscented Kalman Filter (UKF) and present the problem definition showing the above-mentioned limitation. Finally, we want to show the effectiveness of the approach that will be demonstrated through computer simulations.