Publications - Tiziran

Publications


JOURNAL PAPERS:
2014 Adaptive Image Thresholding based On the Peak Signal-To-Noise Ratio, Research Journal of Applied Sciences, Engineering and Technology 8(9).
DOI: 10.19026/rjaset.8.1074

The aim of this research is to enhance a Peak Signal-to-Noise Ratio (PSNR)-based thresholding algorithm. Thresholding is a critical step in pattern recognition and has a significant effect on the subsequent steps in imaging applications. It is used to separate objects from the background, which decreases the amount of data and increases computational speed. Recently, there has been increased interest in multilevel thresholding. However, as the number of levels increases, the computation time increases; single-threshold methods are faster than multilevel methods. Moreover, for each new application, new methods must be developed. In this study, a new algorithm that applies the peak signal-to-noise ratio as an indicator to segment the image is proposed. The algorithm was tested on a license plate recognition system, the DIBCO 2009 dataset and standard images. The proposed algorithm is comparable to existing methods when applied to Malaysian vehicle images. It performs better than earlier methods, such as Kittler and Illingworth's Minimum Error Thresholding, potential difference and Otsu. In general, the proposed algorithm yields better results for standard images. In the license plate recognition application, the new method yielded average performance.
The proposed adaptive threshold method, based on the Peak Signal-to-Noise Ratio (PSNR), has the potential to be applied in many domains, such as LPR and OCR. Based on the experiments, the proposed algorithm achieves competitive results on four databases: Malaysian vehicle, standard, printed and handwritten images. It achieves better results than older methods, but slightly worse results than newer methods, such as multilevel thresholding. In addition, the multi-threshold technique does not work in real-time systems, but it works in the LPR system. On the other databases, the results of the proposed method are satisfactory for global images. Recently, PSNR has been widely used as a stopping criterion in multilevel threshold methods for segmenting images. Alternatively, we have applied the PSNR as a criterion to determine the most suitable threshold value. We evaluated the proposed method with the license-plate recognition system and, at the same time, compared it with state-of-the-art multilevel and multi-threshold methods. The proposed method produced acceptable results in all conditions, such as different contrast or brightness. The older methods, such as Otsu, Kittler and Illingworth, maximum entropy and potential difference, are still valid; however, the newer methods, like multi-threshold and multilevel (recursive) thresholding, perform better in specialised domains. Unlike these other methods, the proposed method yielded average results in all domains. The objective of this research was to develop a new single adaptive thresholding algorithm that works for a wide range of pattern recognition applications. The proposed method has been implemented in four different types of applications and compared with other methods. The results show that the proposed algorithm achieves the objective, as it obtained reasonable results in all four domains.
A recently developed Malaysian LPR system uses a fixed threshold to segment the number plate and the characters. Experiments proved that, via a tailor-made thresholding method, the algorithm can be improved significantly. Note that the proposed method has been tested off-line. Another advantage of the proposed approach is that the adaptive threshold values can change according to the environment, such as the high or low contrast encountered during photographing at night, at mid-day, underground or on a rainy day. The proposed algorithm is suitable for use in the LPR system and is competitive with the newer methods for LPR. Because of its low accuracy in the LPR system, we suggest not including the Otsu method in future studies. The PSNR of the proposed method was better than that of the Kittler and Illingworth method on standard database images. The Otsu method, which performed poorly in the LPR system, is adequate for producing a PSNR evaluation of standard images.
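The selection rule described in the abstract (PSNR as the criterion for the most suitable threshold) can be sketched as follows. This is a minimal reading of the text, not the paper's exact algorithm: it assumes the criterion is simply to maximize the PSNR between the grayscale image and each candidate binarization, and `psnr_threshold` is a hypothetical name.

```python
import numpy as np

def psnr(original, processed, peak=255.0):
    """Peak signal-to-noise ratio between two images, in dB."""
    mse = np.mean((original.astype(np.float64) - processed.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def psnr_threshold(gray):
    """Pick the global threshold whose binarization scores the highest
    PSNR against the input grayscale image."""
    best_t, best_score = 0, -np.inf
    for t in range(1, 255):
        binary = np.where(gray > t, 255, 0)   # candidate binarization
        score = psnr(gray, binary)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

On a bimodal image, any threshold between the two modes gives the same (maximal) PSNR, so the sketch returns the first such value.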

2013 Simultaneous Localization and Mapping Trends and Humanoid Robot Linkages, Asia-Pacific Journal of Information Technology and Multimedia (APJITM), 2013
Farshid Pirahansiah, Siti Norul Huda Sheikh Abdullah and Shahnorbanun Sahran (2013). Simultaneous localization and mapping trends and humanoid robot linkages. Asia-Pacific Journal of Information Technology and Multimedia, 2 (2), pp. 27-38. ISSN 2289-2192
Simultaneous localization and mapping (SLAM), also known as concurrent mapping and localization (CML), is an important topic in the robotics field. This method produces a real-time map of an environment and finds the current position of a robot on that map. It is generally used to solve the problems of “Where am I?” for localization, “Where do I go?” for goal determination, and “How do I go there?” for robot motion planning. Recently, the number of studies in this area has increased rapidly and expanded into different areas. This paper analyzes SLAM, or CML, which is currently a hot topic in robotics research. In addition, this paper describes methods for solving SLAM problems, presents evaluation methods for SLAM, analyzes recent research on SLAM worldwide, and studies the academic importance of SLAM. This paper also reviews the use of SLAM for humanoid robots and aims to address the significance of the SLAM engine in the future of stereo vision on humanoid robots.
New combinatory methods, such as grid-based FastSLAM (Stachniss et al., 2005) and graph-based SLAM with landmarks (Grisetti et al., 2010), have been proposed recently. Some recent studies on stereo vision SLAM have also been reported, such as near real-time learning of 3D point-landmark and 2D occupancy-grid maps using particle filters; laser-range data for 3D SLAM in outdoor environments (Cole & Newman, 2006); detailed 3D mapping based on image edge-point ICP and recovery from registration failure (Tomono, 2009); gamma-SLAM, which uses stereo vision and variance grid maps for SLAM in unstructured environments (Marks et al., 2008); and SLAM for autonomous mobile robots using a binocular stereo vision system (Lu-Fang, Yu-Xian & Sheng, 2007). Grid-based FastSLAM solves the loop-closure problem in SLAM. Graph-based SLAM with landmarks increases map accuracy as well as solving loop closure. Occupancy grids divide the environment into small cells of a predefined size and classify them as occupied or not, and their variants can result in an impracticably large state vector. The importance and effectiveness of these techniques are undeniable. However, less effort has been dedicated to 3D SLAM on humanoid robots. One study used a Rao-Blackwellized particle filter (Kwak et al., 2009) on a humanoid robot (Kaneko et al., 2004), and another used stereo vision (Tomita et al., 1998). These studies found that the map and stereo vision are very noisy (Kwak et al., 2009) and need to be improved. One cause of noisy vision is shaky video caused by the movement of a humanoid robot, which causes difficulties in recognizing and detecting objects. This issue should be addressed because the real world is full of moving objects. Future work should be devoted to applying the system to stereo vision SLAM for humanoid robots in real 3D environments.
Robots have to interact with a 3D environment and need metric data to conduct path planning; thus, they require a 3D environment map. Among SLAM methods, the grid-based method is suitable for our humanoid robot. Feature-based SLAM is efficient for localization but cannot work properly with unknown features and path planning (Kwak et al., 2009). For stereo vision, the first step is to set up a stereo camera on a fixed baseline and calibrate it. As the robot moves, the camera needs to be stabilized to ensure more accurate vision. Hence, a 3D feature should be included to ensure correct recognition and localization of objects. A landmark to be used for SLAM will then be selected based on 3D features. The Kalman filter (KF) will be applied to SLAM using landmark selection, and a 3D environment map can then be created. Research on SLAM can greatly benefit the field of robotics and should be given more attention. Future studies can involve stereo video stabilization as well as SLAM with 3D vision for humanoid robots. Highly critical places should be a focus of attention; for example, during natural disasters, a highly critical place is one where people are located.
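The grid-based mapping favoured above rests on per-cell occupancy updates. A minimal sketch, assuming the standard log-odds formulation; the `L_HIT`/`L_MISS` increments are illustrative values, not taken from the paper:

```python
import numpy as np

# Each cell stores log-odds log(p / (1 - p)); 0.0 means unknown (p = 0.5).
# Assumed sensor model: a hit raises the odds, a pass-through lowers them.
L_HIT, L_MISS = 0.85, -0.4

def update_cell(l, hit):
    """Bayesian log-odds update for one grid cell after one observation."""
    return l + (L_HIT if hit else L_MISS)

def probability(l):
    """Convert a log-odds value back to an occupancy probability."""
    return 1.0 / (1.0 + np.exp(-l))

grid = np.zeros((10, 10))                        # all cells unknown
grid[4, 5] = update_cell(grid[4, 5], hit=True)   # range reading ended here
grid[4, 4] = update_cell(grid[4, 4], hit=False)  # ray passed through here
```

Storing log-odds instead of probabilities makes each update a single addition and avoids saturation at 0 or 1.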
2013 Peak Signal-To-Noise Ratio Based On Threshold Method For Image Segmentation, (2013). Journal of Theoretical & Applied Information Technology, 57(2).
Binarization, or thresholding, is a problem that must be solved in pattern recognition, and it has a very important influence on the subsequent steps in imaging applications. Thresholding is used to separate objects from the background, which decreases the amount of data and increases computational speed. Recently, interest in multilevel thresholding has increased. However, as the number of levels increases, the computation time increases, so single-threshold methods are faster than multilevel methods. Moreover, for every new application, new methods must be developed. In this work, a new algorithm that uses the peak signal-to-noise ratio as an indicator to segment the image is proposed. The algorithm was tested on the DIBCO 2011 printed and handwritten images. This method performs better than earlier methods, such as Kittler and Illingworth's Minimum Error Thresholding, potential difference and Otsu.
The proposed adaptive threshold method, based on the peak signal-to-noise ratio (PSNR), has the potential to be applied in OCR. Based on the experiments, the proposed algorithm achieves competitive results on standard, printed and handwritten images. It achieves better results than previous methods, but slightly worse results than newer methods, such as multilevel thresholding. Recently, PSNR has been widely used as a stopping criterion in multilevel threshold methods for segmenting images. Alternatively, we have applied the PSNR as a criterion to determine the most suitable threshold value. We evaluated the proposed method with the license-plate recognition system and, at the same time, compared it with state-of-the-art multilevel and multi-threshold methods. The proposed method produced acceptable results in all conditions, such as different contrast or brightness.
2013 Character recognition based on global feature extraction; Journal of Theoretical and Applied Information Technology 52.
Optical character recognition (OCR) is one of the most important fields in pattern recognition; it is able to recognize handwritten characters, irregular characters and machine-printed characters. An OCR system consists of five major tasks: pre-processing, segmentation, feature extraction, classification and recognition. Generally, less discriminative features in a global feature approach lead to a reduced recognition rate. These problems are overcome by proposing a global approach that produces more discriminative features and lower data dimensionality. Two feature extraction methods are studied, namely the Gray Level Co-occurrence Matrix (GLCM) and the Edge Direction Matrix (EDMS), and a combination of these two popular feature extraction methods is proposed. The most important problem of EDMS is that it produces only 18 features, which is not enough for feature extraction purposes and reduces the recognition rate. The aim of this research is to improve the recognition rate of EDMS by combining it with a global feature extraction method in order to increase the number of extracted features and produce a better recognition rate. The proposed method is a combination of the GLCM and EDMS methods, with and without a feature selection method (gain ratio with ranker search) applied to reduce the dimensionality of the data. They have been tested on four different datasets involving manually and automatically cropped licence plate and font-style images, amounting to 3520 images of the digits 0 to 9 and capital letters A to Z with various sizes and shapes. Another dataset is large binary images of shapes, comprising 300 images of objects. Gain ratio and ranker search are then used to select discriminative features, whereby the features are reduced from 58 to 34. The proposed combinatory method and the EDMS and GLCM methods are classified using a neural network, a Bayes network and a decision tree.
The experimental results for character recognition indicate that the proposed combinatorial method obtains a better average accuracy rate of about 85.99%, whereas EDMS, GLCM and the combination without feature selection achieved 80.19%, 38.84% and 58.78% respectively. The experimental results for object recognition indicate that the proposed method, before and after feature selection, outperformed the other methods, such as EDMS and GLCM, with 90.83% and 92.5% accuracy rates respectively with a neural network (NN) as the classifier. Consequently, global and spatial approaches are compared for the recognition of objects and characters. The experimental results show the better performance of the proposed method as a global feature extraction method for object recognition, with a 92.5% accuracy rate with NN, while the Robinson filter, as a spatial feature extraction method, outperformed the global feature extraction methods for character recognition, with a 100% accuracy rate with NN. The global approach also has a smaller processing time compared to the spatial approach.
In this paper, an overall explanation of optical character recognition systems and feature extraction is given, and a number of important and relevant feature extraction techniques are discussed. The proposed feature extraction methodology is explained, one spatial feature extraction method (the Robinson filter) is introduced, and the framework of the proposed method is presented. The experimental results for GLCM, EDMS and the proposed method, before and after feature selection, applied to the license plate and font-style image datasets are presented together with a performance comparison. Furthermore, some experiments have been performed to indicate the performance of spatial and global feature extraction techniques for character and object recognition. The results of the proposed method as a global feature extraction technique are compared with the Robinson filter as a spatial feature extraction technique on object and character datasets, which are large binary images of shapes and automatically cropped license plate images respectively. All global feature extraction methods, such as EDMS, GLCM and the proposed method before and after feature selection, have been tested on this standard dataset. The experimental results indicate that the proposed method, before and after feature selection, outperformed the other methods, such as EDMS and GLCM, with 90.83% and 92.5% accuracy rates respectively with NN as the classifier. A comparative study of the application of spatial and global feature extraction methods to the recognition of objects and characters is performed. For this purpose, the Robinson filter, as a spatial feature extraction method, is applied to the large binary shape image dataset for object recognition and the automatically cropped license plate image dataset for character recognition, as is the proposed method before and after feature selection as a global feature extraction method.
The experimental results show the better performance of the proposed method as a global feature extraction method for object recognition, with a 92.5% accuracy rate with NN, while the Robinson filter, as a spatial feature extraction method, outperformed the global feature extraction methods for character recognition, with a 100% accuracy rate with NN. As a result, it is arguable that global feature extraction methods are more appropriate for object recognition, while spatial feature extraction methods perform better for character recognition. The processing time of global and spatial feature extraction has been compared as well. The experimented spatial feature extraction method is extremely time-consuming compared with the global feature extraction methods, because it produces more features, has high data dimensionality and takes more time to process. By applying the global feature extraction methods, the dimensionality of the data is reduced and the classification performance increased. The proposed feature extraction might be applied in other image processing applications, such as handwritten and printed image recognition. One objective for future work in this research is to increase the recognition rate by modifying the selected features and to develop this method in other areas of research. Given the good performance of this method on the recognition of special characters with circles and curves, this property can be improved and applied to the recognition of complex characters in complex text. The proposed method, before and after feature selection, can be applied to object recognition, as it produced better results for object recognition than the other experimented methods.
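A minimal sketch of the GLCM step discussed above, assuming a single horizontal offset and a reduced number of gray levels; the paper's full method also combines EDMS features and gain-ratio selection, which are not shown:

```python
import numpy as np

def glcm(img, levels=4):
    """Normalized gray-level co-occurrence matrix for the horizontal
    (0, 1) offset only."""
    m = np.zeros((levels, levels))
    for i, j in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        m[i, j] += 1          # count each horizontal neighbour pair
    return m / m.sum()

def contrast(m):
    """GLCM contrast feature: sum over (i - j)^2 * p(i, j)."""
    idx = np.arange(m.shape[0])
    return float((((idx[:, None] - idx[None, :]) ** 2) * m).sum())

# A tiny image already quantized to 4 gray levels.
quantized = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [2, 2, 3, 3],
                      [2, 2, 3, 3]])
```

In practice several offsets and angles are accumulated, and further Haralick features (homogeneity, energy, correlation) are read off the same matrix.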

PROCEEDINGS:
2015 Augmented optical flow methods for video stabilization. 4th Artificial Intelligence Technology Postgraduate Seminar (CAITPS 2015)
Video data is one of the most important and useful sources of information in computer vision applications. Since vibration occurs unavoidably when the camera is moving, video stabilization, which removes unwanted camera motion, is an important function for computer vision. This image sequence enhancement is necessary to improve the performance of subsequent, more complicated image processing algorithms. Video stabilization for consumer video cameras has been under development for several years. For video stabilization systems, various techniques have been proposed to estimate motion vectors, and robust methods use optical flow to obtain them. The key challenge in optical flow is pre-setting the parameters of the Gaussian pyramid. In this research, an enhanced optical flow (OF) method based on type-2 fuzzy logic (T2FOF) for video stabilization is presented. The proposed method is compared with the state of the art on well-known benchmarks such as the SINTEL and Middlebury benchmarks. The T2FOF method outperformed Farneback's method in experiments on the MPI Sintel datasets, with average spatiotemporal angular errors of 4.54 and 7.19 respectively.
The experiments were conducted on the MPI SINTEL benchmark datasets. Four parts of the SINTEL datasets (Alley, Ambush, Bamboo and Cave) were used to compare the proposed method with Farneback's method [3]. The SINTEL dataset contains large motion, motion blur and defocus blur, on which the proposed method yields less error relative to Farneback's method [3]. In this research, an enhanced optical flow (OF) method based on type-2 fuzzy logic (T2FOF) is presented, and video stabilization is enhanced with the proposed T2FOF; the method is compared on the well-known SINTEL benchmark. Table 1 shows the result for each part of the SINTEL dataset together with the overall performance.
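The T2FOF flow estimator itself is not reproduced here. As a generic illustration of the stabilization idea of estimating an inter-frame motion vector, the following uses FFT phase correlation, a different but standard technique for recovering a global translation between frames:

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer (dy, dx) translation of frame b relative to
    frame a from the peak of the normalized cross-power spectrum."""
    cross = np.conj(np.fft.fft2(a)) * np.fft.fft2(b)
    cross /= np.abs(cross) + 1e-12           # keep phase, drop magnitude
    corr = np.fft.ifft2(cross).real          # impulse at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = a.shape
    if dy > h // 2:                          # wrap large peaks to negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

A stabilizer would apply the negated shift to each incoming frame; flow-based methods such as Farneback's instead estimate a dense per-pixel motion field.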
2015 Auto-Calibration for Multi-Modal Robot Vision based on Image Quality Assessment, the 10th Asian Control Conference (ASCC 2015)
Multi-dimensional robot vision in autonomous humanoid robots is still an open issue, as it performs less effectively when dealing with different environments. Robot vision becomes more challenging as image quality degrades. Unlike human vision, current robot vision cannot yet calibrate automatically when image quality changes abruptly. This may result in poor accuracy due to false-negative input data points, and the user needs to recapture new calibration images to compensate. Therefore, this study emphasizes proposing an automatic calibration for multimodal robot vision based on quality measures. We organize our research methodology into three steps. First, we capture a series of image patterns using our calibration pattern equipment. Second, we employ an Image Quality Assessment Function (IQAF) that includes PSNR and SSIM to measure points of image abruption simultaneously. In the experiment, we observed differences between the real distance and the computed distance and compared them to those of the self-collected original database and the blur database.
The contribution of this paper is the application of two different methods, namely full-reference IQA, to create calibration-pattern image datasets for camera calibration (CC). Future work will use a mathematical optimization or feasibility program in which some or all variables are restricted to integers. Many settings refer to this as integer linear programming, in which the objective function and constraints (other than the integer constraints) are linear, to improve CC speed.
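The IQAF above combines PSNR and SSIM. A minimal sketch of the SSIM half, assuming the standard constants; note this computes a single global SSIM over the whole image, whereas reference SSIM averages over a sliding Gaussian window:

```python
import numpy as np

def global_ssim(x, y, peak=255.0):
    """SSIM computed once over the whole image (no sliding window)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2   # standard stabilizers
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score 1.0; heavy blur or dropout pushes the score toward 0, which is the signal an auto-calibration step can trigger on.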

2012 2D versus 3D Map for Environment Movement Objects, 2nd National Doctoral Seminar in Artificial Intelligence Technology (CAIT 2012)
The ability of a robot to localize itself and simultaneously build a map of its environment (Simultaneous Localization and Mapping (SLAM) or Concurrent Mapping and Localization (CML)) is a fundamental characteristic required for its autonomous operation. Vision sensors/cameras, which have recently been widely used, are low-cost, light and compact, easily available, offer passive sensing, have low power consumption, and provide rich information about the environment, enabling the detection of stable features. As a SLAM system starts, landmarks for SLAM can be initialized in an un-delayed manner. By contrast, the landmarks are initialized with some delay when a single camera is used to perform SLAM without any artificial target, because multiple acquisitions from a single camera are required to compute the 3D locations of the observed features. Different algorithms have been used to perform SLAM, including Extended Kalman Filtering, Particle Filtering, biologically inspired techniques like RatSLAM, and others like Local Bundle Adjustment. Kalman filters have become a standard approach for reducing errors in a least-squares sense while fusing measurements from different sources, and the Kalman filter has been widely employed as part of vision development in robots. Handling visual measurements that contain noise and uncertainty is an important role of this filter in SLAM (Prabuwono & Idris 2008). In this paper, we aim to propose a localization method for moving obstacles based on vision guidance, and a 3D map framework for 3D moving objects based on the Scale Invariant Feature Transform (SIFT).
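Since the abstract leans on the Kalman filter for fusing noisy measurements, a one-dimensional predict/update sketch may help. The noise parameters and the static-landmark scenario are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def kalman_step(x, p, z, q=1e-3, r=0.1):
    """One predict/update cycle of a 1-D Kalman filter for a static state.
    x, p: current estimate and its variance; z: noisy measurement;
    q, r: process and measurement noise variances (illustrative values)."""
    p = p + q                 # predict: state unchanged, uncertainty grows
    k = p / (p + r)           # Kalman gain
    x = x + k * (z - x)       # update: move estimate toward the measurement
    p = (1.0 - k) * p         # shrink uncertainty
    return x, p

# Fuse 50 noisy range readings of a fixed landmark at distance 5.0.
rng = np.random.default_rng(0)
x, p = 0.0, 1.0               # start far off, with large uncertainty
for z in 5.0 + 0.3 * rng.standard_normal(50):
    x, p = kalman_step(x, p, z)
```

In EKF-SLAM the same predict/update structure is applied to a joint state vector of robot pose and landmark positions, with matrix-valued gains.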

2011 Comparison Single Thresholding Method for Image Segmentation on Handwritten Images, International Conference on Pattern Analysis and Intelligent Robotics, Putrajaya, Malaysia
Thresholding is one of the critical steps in pattern recognition and has a significant effect on the subsequent steps of an imaging application. The important objectives of thresholding are separating objects from the background and decreasing the amount of data, which consequently increases speed. Handwriting recognition is an important problem with various applications on mobile devices. The peak signal-to-noise ratio (PSNR) is one method for measuring the quality of images. Our proposed method applies PSNR as an indicator to segment the images. We also compare our proposed method with other existing methods, and the results are comparable. This algorithm can be optimized to increase performance. The results indicate that the proposed method works on average handwritten images, because the PSNR value of the proposed method is better than that of the other methods.
In this paper, several thresholding methods have been compared. For each, the PSNR value and the threshold value are calculated. It is difficult to compare all thresholding methods because of the different domains the methods are applied to. Results show the proposed algorithm is comparable on handwritten images and gave better results than the older methods, such as Kittler and Illingworth's MET and potential difference. However, it is slightly worse in comparison with newer methods such as multilevel and multi-thresholding. It can be concluded that the proposed algorithm gave better results on handwritten images.
2011 License Plate Recognition With Multi-Threshold Based on Entropy, 3rd International Conference on Electrical Engineering and Informatics (ICEEI 2011), Bandung, Indonesia
Among all existing segmentation techniques, thresholding is one of the most popular due to its simplicity, robustness and accuracy. Multi-thresholding is an important operation used in many analyses and applications, and selecting the correct thresholds to get a better result is a critical issue. In this research, a multilevel thresholding method based on maximum entropy is proposed. The maximum-entropy thresholding algorithm selects several threshold values by maximizing the cross entropy between the original image and the segmented image. This method can effectively integrate a partial range of the image histogram. The proposed algorithm is compared with a single-thresholding method based on maximum entropy and with a multilevel thresholding method. The proposed multi-thresholding method is tested on a license plate application. Based on the experiments, the multi-threshold method can be further improved to increase segmentation accuracy in the future.
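The single-level maximum-entropy baseline mentioned above can be sketched as follows. This is a Kapur-style criterion (maximize the sum of background and foreground histogram entropies); the paper's multi-threshold extension and cross-entropy variant are not shown:

```python
import numpy as np

def max_entropy_threshold(gray, bins=256):
    """Pick the threshold t maximizing the summed entropies of the
    background (<= t) and foreground (> t) histogram partitions."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    cdf = np.cumsum(p)
    best_t, best_h = 0, -np.inf
    for t in range(1, bins - 1):
        w0, w1 = cdf[t], 1.0 - cdf[t]        # partition weights
        if w0 <= 0 or w1 <= 0:
            continue
        p0, p1 = p[: t + 1] / w0, p[t + 1 :] / w1
        h0 = -np.sum(p0[p0 > 0] * np.log(p0[p0 > 0]))
        h1 = -np.sum(p1[p1 > 0] * np.log(p1[p1 > 0]))
        if h0 + h1 > best_h:
            best_t, best_h = t, h0 + h1
    return best_t
```

The multilevel version repeats the same search recursively, splitting each partial histogram range in turn.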
2011 Character recognition based on global feature extraction, 3rd International Conference on Electrical Engineering and Informatics (ICEEI 2011), Bandung, Indonesia

This paper presents an enhanced feature extraction method that is a combination and selection of two feature extraction techniques, the Gray Level Co-occurrence Matrix (GLCM) and the Edge Direction Matrix (EDMS), for character recognition. One of the most important steps in a character recognition system is selecting a good feature extraction technique, and the variety of methods makes it difficult to find the best technique for character recognition. The dataset of images applied to the different feature extraction techniques includes binary characters of different sizes. Experimental results show the better performance of the proposed method compared with the GLCM and EDMS methods after performing feature selection, with neural network, Bayes network and decision tree classifiers.

2010 Adaptive image segmentation based on Peak Signal to Noise Ratio for a license plate recognition system, International Conference on Computer Applications and Industrial Electronics (ICCAIE 2010), Kuala Lumpur, Malaysia
The objective of this paper is to propose an adaptive threshold method based on the peak signal-to-noise ratio (PSNR). Nowadays, PSNR is widely used as a stopping criterion in multilevel threshold methods for segmenting images. Alternatively, we apply the PSNR as a criterion to find the most suitable threshold value. We evaluate the proposed method on a license plate recognition application. At the same time, we compare the proposed algorithm with multilevel and multi-threshold methods as benchmarks. Via the proposed technique, the threshold can change according to the environment, such as in high- or low-contrast situations.
In conclusion, a Malaysian LPR system developed recently uses a fixed threshold to segment the number plate and the characters. The experiments proved that, via a tailor-made thresholding method, the algorithm can be improved significantly. Note that the proposed method has been tested on off-line processing of images. This thresholding algorithm also gives reliable accuracy for license plate detection. Another advantage of the proposed approach is that the adaptive threshold values can change according to the environment in high- or low-contrast situations, such as at night, mid-day, underground or on rainy days.

2010 Multi-threshold approach for license plate recognition system, International Conference on Signal and Image Processing WASET Singapore, ICSIP 2010:1046-1050 - , Singapore
The objective of this paper is to propose an adaptive multi-threshold for image segmentation, specifically for object detection. Due to the different types of license plates in use, the requirements of an automatic LPR system differ for each country. The proposed technique is applied to a Malaysian LPR application. It is based on a Multi-Layer Perceptron trained by back-propagation. The proposed adaptive threshold is introduced to find the optimum threshold values. The technique relies on the peak value in the graph of the number of objects versus a specific range of threshold values. The proposed approach has increased the overall performance compared to current optimal threshold techniques. Further improvement of this method to accommodate real-time system specifications is in progress.
A Malaysian LPR system developed recently uses a fixed threshold to segment the number plate and the characters. Experiments proved that, via a tailor-made thresholding method, the algorithm can be improved significantly. Note that the proposed method has been tested on off-line processing of images. Another advantage of the proposed approach is that the adaptive threshold values can change according to the environment in high- or low-contrast situations, such as at night, mid-day, underground or on rainy days.

2010 An evaluation of classification techniques using enhanced Geometrical Topological Feature Analysis, 2nd Malaysian Joint Conference on Artificial Intelligence (MJCAI 2010) Kuching Sarawak, Malaysia
In this paper, we evaluate the best classification techniques for a Malaysian license plate recognition (LPR) system. We also discuss four image classification techniques that are used in contemporary LPR systems worldwide: the artificial immune recognition system, neural network, Bayesian network and support vector machine. We propose and apply enhanced geometrical topological feature analysis to Malaysian character and number images as their inputs. We also explain character error analysis based on those image classification approaches. The results show that the support vector machine outperforms the other classifiers.


