All rights reserved. This Manual is for the sole use of designated educators and must not be distributed to students except in short, isolated portions and in conjunction with the use of Pattern Classification (2nd ed.)

June 18, 2003

Preface

In writing this Solution Manual I have learned a very important lesson. As a student, I thought that the best way to master a subject was to go to a superb university and study with an established expert. Later, I realized instead that the best way was to teach a course on the subject. Yet later, I was convinced that the best way was to write a detailed and extensive textbook. Now I know that all these years I have been wrong: in fact the best way to master a subject is to write the Solution Manual. In solving the problems for this Manual I have been forced to confront myriad technical details that might have tripped up the unsuspecting student. Students and teachers can thank me for simplifying or screening out problems that required pages of unenlightening calculations. Occasionally I had to go back to the text and delete the word "easily" from problem references that read "it can easily be shown (Problem ...)." Throughout, I have tried to choose data or problem conditions that are particularly instructive. In solving these problems, I have found errors in early drafts of this text (as well as errors in books by other authors and even in classic refereed papers), and thus the accompanying text has been improved by the writing of this Manual. I have tried to make the problem solutions self-contained and self-explanatory. I have gone to great lengths to ensure that the solutions are correct and clearly presented; many have been reviewed by students in several classes. Surely there are errors and typos in this manuscript, but rather than editing and rechecking these solutions over months or even years, I thought it best to distribute the Manual, however flawed, as early as possible.
I accept responsibility for these inevitable errors, and humbly ask anyone finding them to contact me directly. (Please, however, do not ask me to explain a solution or help you solve a problem!) It should be a small matter to change the Manual for future printings, and you should contact the publisher to check that you have the most recent version. Notice, too, that this Manual contains a list of known typos and errata in the text which you might wish to photocopy and distribute to students. I have tried to be thorough in order to help students, even to the occasional fault of verbosity. You will notice that several problems have the simple requests "explain your answer in words" and "graph your results." These were added for students to gain intuition and a deeper understanding. Graphing per se is hardly an intellectual challenge, but if the student graphs functions, he or she will develop intuition and remember the problem and its results better. Furthermore, when the student later sees graphs of data from dissertation or research work, the link to the homework problem and the material in the text will be more readily apparent. Note that due to the vagaries of automatic typesetting, figures may appear on pages after their reference in this Manual; be sure to consult the full solution to any problem. I have also included worked examples and some sample final exams with solutions to cover material in the text. I distribute a list of important equations (without descriptions) with the exam so students can focus on understanding and using equations, rather than memorizing them. I also include on every final exam one problem verbatim from a homework, taken from the book. I find this motivates students to review carefully their homework assignments, and allows somewhat more difficult problems to be included. These will be updated and expanded; thus if you have exam questions you find particularly appropriate, and would like to share them, please send a copy (with solutions) to me.
It should be noted, too, that a set of overhead transparency masters of the figures from the text is available to faculty adopters. I have found these to be invaluable for lecturing, and I put a set on reserve in the library for students. Numerous students and colleagues have made suggestions. Especially noteworthy in this regard are Sudeshna Adak, Jian An, Sung-Hyuk Cha, Koichi Ejiri, Rick Guadette, John Heumann, Travis Kopp, Yaxin Liu, Yunqian Ma, Sayan Mukherjee, Hirobumi Nishida, Erhan Oztop, Steven Rogers, Charles Roosen, Sergio Bermejo Sanchez, Godfried Toussaint, Namrata Vaswani, Mohammed Yousuf and Yu Zhong. Thanks too go to Dick Duda, who gave several excellent suggestions. I would greatly appreciate notices of any errors in this Manual or the text itself. I would be especially grateful for solutions to problems not yet solved. Please send any such information to me at the address below. I will incorporate them into subsequent releases of this Manual. This Manual is for the use of educators and must not be distributed in bulk to students in any form. Short excerpts may be photocopied and distributed, but only in conjunction with the use of Pattern Classification (2nd ed.). I wish you all the best of luck in teaching and research.

Thus the bound on x. Our goal is to show that Z is also normally distributed. This does not change the derivatives. Moreover, since we are dealing with a single Gaussian, we can dispense with the needless subscript indicating the category. As we saw above, ri2 points in the same direction along any line. Equation 65 in the text states. We can understand that unusual case by analogy to a one-dimensional problem that has a decision region consisting of a single point, as follows. (a) Suppose we have two one-dimensional Gaussians, of possibly different means and variances. The minimum probability of error is achieved by the following decision rule: Choose.
For the qij term, the simplest exponent is 1 x2i, and so on. Thus we have indeed, in the case p d k. Thus we compute numerically. Now we infer the probability P(ai | x1). Therefore, our likelihood. If the optimal solution does not lie in the solution space, then the proofs do not hold. This is actually a very strong restriction. Note that obtaining the solution with error equal to 100% is not dependent upon limited data, or getting caught in a local minimum; it arises because of the error in assuming that the optimal solution lies in the solution set. Now we turn to the covariance of the second distribution. The Bayesian learning method considers the parameter vector. The posterior density p(x) also depends upon the probability density p(θ) distributed over the entire parameter space. One is the computation of the probability density. We are to suppose that s is a statistic for which p(...). Given one operation per nanosecond, the total number of basic operations in the periods are: 10^9 in one second; 3.6 × 10^12 in one hour; 8.64 × 10^13 in one day; 3.156 × 10^16 in one year. One possible advantage of the non-recursive method is that it might avoid compounding roundoff errors. Consider the formula for C. The complexity associated with determining C. Then we write the criterion function as a Rayleigh quotient w^t S_B w / (w^t S_W w). (a) The expected value of. We consider an iterative algorithm in which the maximum-likelihood value of missing values is calculated, then assumed to be correct for the purposes of reestimating. Similarly, the γij's can be computed in O(c^2 T) operations given the αi(t)'s, aij's, bij's, βi(t)'s and P(V^T | M). P(V1, V2, hidden state ...). Computer exercise not yet solved. Section 3.9 11. Computer exercise not yet solved 12. (c) We write the bias as p(x). A unique hyperplane separates the space into those points that are closer to x* than to y, as shown in the figure.
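The operation counts quoted above can be verified with a quick computation of my own (a sketch, assuming exactly one basic operation per nanosecond and a Julian year of 365.25 days, which matches the quoted 3.156 × 10^16):

```python
# Number of basic operations at one operation per nanosecond (1e9 ops/sec),
# accumulated over the time periods quoted in the solution.
OPS_PER_SECOND = 10**9

periods = {
    "second": 1,
    "hour": 3600,
    "day": 24 * 3600,
    "year": 365.25 * 24 * 3600,  # Julian year, giving ~3.156e16 operations
}

for name, seconds in periods.items():
    print(f"one {name}: {OPS_PER_SECOND * seconds:.4g} operations")
```

Each figure agrees with the solution text: 10^9, 3.6 × 10^12, 8.64 × 10^13, and about 3.156 × 10^16.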
Consider any two points x(0) and x(1) inside the Voronoi cell of x*; these points are surely on the side of the hyperplane nearest x*. Furthermore, the result holds for every other sample point yi. Thus x(λ) remains closer to x*. By our definition of convexity, we have, then, that the Voronoi cell is convex. 8. It is indeed possible to have the nearest-neighbor error rate P equal to the Bayes error rate P*. Note that this automatically imposes the restriction 0 ≤ cr ≤ c − 1. Now there are scores of algorithms available for both problems, all with different complexities. A theorem in the book by Preparata and Shamos refers to the complexity of the Voronoi diagram itself, which is of course a lower bound on the complexity of computing it. This complexity was solved by Victor Klee, "On the complexity of d-dimensional Voronoi diagrams," Archiv. One of these distances is the greatest. In contrast, the L∞ distance to the closest of the faces of the hypercube approaches 0.0, because we can nearly always find an axis for which the distance to a face is small. Thus, nearly every point is closer to a face than to another randomly selected point. In short, nearly every point is on the "outside" (that is, on the "convex hull") of the set of points in a high-dimensional space; nearly every point is an "outlier." We now demonstrate this result formally. The probability that any of the coordinates is closer to a face than l. Therefore, our discussion above is "pessimistic"; that is, if x is closer to a face than to point y according to the L∞ metric, then it is "even closer" to the wall than to a face in the L2 metric. Thus our conclusion above holds for the Euclidean metric too. Note that prototype f does not contribute to the class boundary due to the existence of points e and d. Hence f should be removed from the set of prototypes by the editing algorithm (Algorithm 3 in the text).
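The "nearly every point is an outlier" phenomenon described above can be illustrated with a small Monte Carlo sketch of my own (not from the Manual): for uniform points in the unit hypercube, compare the L∞ distance to the nearest face against the L∞ distance to another random point.

```python
import random

def dist_to_nearest_face(x):
    # L-infinity distance from x to the nearest face of the unit hypercube:
    # the smallest coordinate-wise distance to either 0 or 1.
    return min(min(xi, 1.0 - xi) for xi in x)

def linf(x, y):
    # L-infinity (Chebyshev) distance between two points.
    return max(abs(a - b) for a, b in zip(x, y))

def fraction_closer_to_face(d, trials=2000, seed=0):
    # Fraction of trials in which a uniform random point is closer (in L-inf)
    # to a face of the cube than to an independently drawn random point.
    rng = random.Random(seed)
    closer = 0
    for _ in range(trials):
        x = [rng.random() for _ in range(d)]
        y = [rng.random() for _ in range(d)]
        if dist_to_nearest_face(x) < linf(x, y):
            closer += 1
    return closer / trials

for d in (1, 5, 20, 100):
    print(d, fraction_closer_to_face(d))
```

As the dimension grows, the fraction approaches 1: in high dimensions nearly every point is nearer a wall than another sample.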
However, this algorithm detects that f has a prototype from another class (prototype c) as a neighbor, and thus f is retained according to step 5 in the algorithm. Consequently, the editing algorithm not only selects the points a, b, c, d and e, which are the minimum set of points, but also retains the "useless" prototype f. If an error is detected, the prototype under consideration must be kept, since it may contribute to the class boundaries. Otherwise it may be removed. This procedure is repeated for all the training patterns. Given a set of training points, the solution computed by the sequential editing algorithm is not unique, since it clearly depends upon the order in which the data are presented. This can be seen in the following simple example in the figure, where black points are in ω1, and white points in ω2. If d is the first point presented to the editing algorithm, then it will be removed, since its nearest neighbor is f, so d can be correctly classified. Then, the other points are kept, except e, which can be removed since f is also its nearest neighbor. Suppose now that the first point to be considered is f. Then, f will be removed since d is its nearest neighbor. However, point e will not be deleted once f has been removed, due to c, which will be its nearest neighbor. This happens if and only if D(a, b) is a metric. Thus we first check whether D obeys the properties of a metric. First, D must be non-negative, that is, D(a, b) ≥ 0, which indeed holds because the sum of squares is non-negative. Problem not yet solved 23. Problem not yet solved 24. Problem not yet solved 25. As we can see, d subtractions, d multiplications and d additions are needed. Consequently, this is an O(d) process. (b) Given a test sample x and a prototype x′, we must compute r non-linear transforms, which depend on a set of parameters subject to optimization.
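The order dependence of sequential editing discussed above can be reproduced with a minimal sketch of my own (a simplified stand-in, not the text's Algorithm 3): drop a prototype whenever its nearest remaining neighbor carries the same label, so the point would still be classified correctly without it.

```python
def nearest(idx, keep, pts):
    # Index of the nearest other kept prototype (squared Euclidean distance).
    best, best_d = None, float("inf")
    for j in sorted(keep):
        if j == idx:
            continue
        d = sum((a - b) ** 2 for a, b in zip(pts[idx], pts[j]))
        if d < best_d:
            best, best_d = j, d
    return best

def sequential_edit(pts, labels, order):
    # Remove a prototype when its nearest remaining neighbor shares its
    # label; the retained set depends on the presentation order.
    keep = set(range(len(pts)))
    for i in order:
        n = nearest(i, keep, pts)
        if n is not None and labels[n] == labels[i]:
            keep.discard(i)
    return keep

pts = [(0.0,), (1.0,), (2.0,)]   # three collinear prototypes
labels = ["w1", "w1", "w2"]
kept_a = sequential_edit(pts, labels, [0, 1, 2])
kept_b = sequential_edit(pts, labels, [1, 0, 2])
assert kept_a != kept_b  # different orders retain different prototype sets
```

Presenting prototype 0 first removes it (its nearest neighbor 1 shares its label), while presenting prototype 1 first removes 1 instead, so the two retained sets differ, just as in the figure's example with d, e and f.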
Then a test sample will be classified in 21 minutes and 30 seconds. 26. Explore the effect of r on the accuracy of nearest-neighbor search based on partial distance. (a) We make some simplifications, specifically that points are uniformly distributed in a d-dimensional unit hypercube. Consider a single dimension. Since the terms na and nb appear in the numerator and denominator in the same way, only the term nab can affect the fulfilment of this property. However, nab and nba give the same measure: the number of elements common to both sets. Thus the Tanimoto metric is indeed symmetric. Thus we can conclude that the Tanimoto metric also obeys the triangle inequality. (b) Since the Tanimoto metric is symmetric, we consider here only 15 out of the 30 possible pairings. We could reasonably suppose that if the "hot" class has something to do with temperature, then it would take into account the fuzzy feature "temperature" for its membership function. Accordingly, values of "temperature" such as "warm" or "hot" might be active to some degree for the class "hot," though the exact dependence between them is unknown to us without more information about the problem. 29. Consider "fuzzy" classification. (a) We first fit the designer's subjective feelings into fuzzy features, as suggested by the table below. Since all the discriminant functions are equal to zero, the classifier cannot assign the input pattern to any of the existing classes. (d) In this problem we deal with a handcrafted classifier. The designer has selected lightness and length as features that characterize the problem and has defined several membership functions for them without any theoretical or empirical justification. Moreover, the conjunction rule that fuses membership functions of the features to define the true discriminant functions is also imposed without any principle. Consequently, we cannot know where the sources of error come from. 30.
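The symmetry argument above can be checked directly with one common set-based form of the Tanimoto distance, D(a, b) = (na + nb − 2·nab) / (na + nb − nab); the exact form used in the text is an assumption here, but the key fact, nab = nba, makes any such form symmetric.

```python
def tanimoto_distance(a, b):
    # Set-based Tanimoto distance:
    #   D(a, b) = (na + nb - 2*nab) / (na + nb - nab),
    # where nab = |a ∩ b|. Symmetric because nab = nba.
    na, nb = len(a), len(b)
    nab = len(a & b)
    denom = na + nb - nab  # this is |a ∪ b|
    return 0.0 if denom == 0 else (na + nb - 2 * nab) / denom

a = {"x1", "x2", "x3"}
b = {"x2", "x3", "x4"}
assert tanimoto_distance(a, b) == tanimoto_distance(b, a)
print(tanimoto_distance(a, b))  # (3 + 3 - 4) / 4 = 0.5
```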
Problem not yet solved. Section 4.8 31. If all the radii have been reduced to a value less than ... Thus their densities have the same value along this vertical line, and as a result this is the Bayesian decision boundary, giving minimum classification error. (b) Of course, if two unimodal distributions overlap significantly, the Bayes error will be high. As shown in Figs. 2.14 and 2.5 in Chapter 2, the (unimodal) Gaussian case generally has quadratic (rather than linear) Bayes decision boundaries. In fact, for the case shown, the optimal boundary is a circle. Furthermore, it is true for any category. For instance, hyperplane H12 would place point A in ω2, point B in ω1, and point C in ω2. The underlining indicates the winning votes in the full classifier system. Thus point A gets three ω1 votes, only one ω2 vote, two ω3 votes, and so forth. We shall show by an example that, conversely, linearly separable samples need not be totally linearly separable. But by our hypothesis, a1 and a2 are minimum-length solution vectors. Therefore, the minimum-length solution vector is unique. However, a classifier that is a piecewise linear machine with the following discriminant functions. We can, in fact, eliminate the w^t x terms with a proper translation of axes. We seek a translation, described by a vector m, that eliminates the linear term. Because the eigenvalues are positive (the matrix is positive definite, which it will be) and not necessarily all equal to each other, in this case the decision boundary is a hyperellipsoid. (c) With an argument following that of part (b), we can represent the characteristics of the decision boundary with ... This implies that all the samples will be classified correctly. Section 5.6 20. As shown in the figure, in this case the initial weight vector is marked 0, and after the successive updates 1, 2, ..., 12, where the updates cease. We can find an expression for the learning rate. Case 1: Suppose the samples are linearly separable. Thus ω1 samples must lie on the same side of a separating hyperplane in order to ensure ω2 samples lie in the other half space. This is guaranteed in the shaded (union) region. We are asked to define the region where there is no shading. The second term in the sum is independent of weight vector ai. Perceptron case: We wish to show what the fixed-increment single-sample Perceptron algorithm given in Eq. 20 in the text does to our transformed problem. We shall use a subscript to denote which Gi that gk belongs to; for instance, gik is in Gi. An update takes place when the sample gk is incorrectly classified, that is, when a^t(k)gk ≤ 0, in the update rule. But this equation is the same as that of a two-layer network having connection matrix W3. Thus a three-layer network with linear units throughout can be implemented by a two-layer network with appropriately chosen connections. Clearly, a non-linearly separable problem cannot be solved by a three-layer neural network with linear hidden units. To see this, suppose a non-linearly separable problem can be solved by a three-layer neural network with linear hidden units. Then, equivalently, it can be solved by a two-layer neural network. Then clearly the problem is linearly separable. But, by assumption, the problem is only non-linearly separable. Hence there is a contradiction and the above conclusion holds true. 2. Fourier's theorem shows that a three-layer neural network with sigmoidal units can act as a universal approximator. Fourier's theorem states z(x) ≈ Σf1 Σf2 Af1f2 cos(f1 x1) cos(f2 x2). Each iteration involves computing ... Moreover, Σk wjk δk is computed in 2c operations. Thus, the time for one iteration is the time to compute ... The larger the magnitude of xi, the larger the weight change. On the other hand, for a fixed input unit i, Δwji depends on δj, the sensitivity of unit j. Now, the weights are chosen to minimize J.
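The collapse of a linear three-layer network into a two-layer one, argued above, is just matrix multiplication: W2(W1 x) = (W2 W1) x. A tiny sketch of my own (arbitrary weights, plain-Python matrices) makes this concrete:

```python
def matmul(A, B):
    # Plain matrix product C = A * B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, x):
    # Matrix-vector product A * x.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# A three-layer network with LINEAR hidden units computes W2 (W1 x),
# which equals (W2 W1) x: exactly a two-layer network.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # input-to-hidden weights
W2 = [[1.0, -1.0, 2.0]]                      # hidden-to-output weights
x = [0.5, -2.0]

three_layer = matvec(W2, matvec(W1, x))
two_layer = matvec(matmul(W2, W1), x)
assert three_layer == two_layer  # identical outputs for any x
```

Since a single weight matrix defines a linear discriminant, no linear hidden layer can add representational power, which is the contradiction used in the argument above.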
Clearly, a large change in weight should occur only to significantly decrease J; to ensure convergence in the algorithm, small changes in weight should correspond to small changes in J. But J can be significantly decreased only by a large change in the output ok. So, large changes in weight should occur where the output varies the most and least where the output varies the least. The number of units in the first hidden layer is nH1 and the number in the second hidden layer is nH2. Clearly, whatever the topology of the original network, setting the wji to be a constant is equivalent to changing the topology so that there is only a single input unit, whose input to the next layer is xo. As a result of this loss of one layer and of the number of input units in the next layer, the network will not train well. 13. If the labels on the two hidden units are exchanged, the shape of the error surface is unaffected. Consider a d-nH-c three-layer network. From Problem 8 part (c), we know that there are nH!·2^nH equivalent relabelings of the hidden units that do not affect the network output. But, for all p(x), we have ... If the assumption is not met, the gradient descent procedure yields the closest projection to the posterior probability in the class spanned by the network. 20. Recall the equation ... This actually restricts the additive models. To have the full power of three-layer neural networks we must assume that fi is a multivariate function of the inputs. We consider the computational burden for standardizing data, as described in the text. (a) The data should be shifted and scaled to ensure zero mean and unit variance (standardization). The first term is for the output-to-hidden weights and the second term for the hidden-to-input weights. Since each hidden unit has prespecified inputs i and m, we use a subscript on i and m to denote their relation to the hidden unit.
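Standardization as described above (shift and scale each feature to zero mean and unit variance) is a short computation; here is a plain-Python sketch of my own, using the population standard deviation:

```python
import math

def standardize(data):
    # Shift and scale each feature (column) to zero mean and unit variance.
    n = len(data)
    d = len(data[0])
    means = [sum(x[j] for x in data) / n for j in range(d)]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x in data) / n)
            for j in range(d)]
    return [[(x[j] - means[j]) / stds[j] for j in range(d)] for x in data]

data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
z = standardize(data)
```

Computing each mean and each standard deviation is one pass over the n samples per feature, so the whole preprocessing step is O(nd), in line with the burden analysis the text asks for.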
Remember that each hidden unit j has three weights or parameters for the two specified input units ij and mj, namely wjij, wjmj, and qj. Another weakness is that the hidden layer outputs are no longer bounded and hence create problems in convergence and numerical stability. Consider the classification task. In essence the network does a linear classification followed by a sigmoidal non-linearity at the output. In order for the network to be able to perform the task, the hidden unit outputs must be linearly separable in the F space. However, this cannot be guaranteed; in general we need a much larger function space to ensure linear separability. The primary advantage of this scheme is due to the task domain. If the task domain is known to be solvable with such a network, the convergence will be faster and the computational load will be much less than a standard backpropagation learning network, since the number of input-to-hidden unit weights is 3nH compared to dnH for standard backpropagation. 30. The solution to Problem 28 gives the backpropagation learning rule for the Minkowski error. In this problem we will derive the learning rule for the Manhattan metric directly. Equation 56 in the text is trivially satisfied. The Quickprop algorithm assumes the weights are independent and the error surface is quadratic. Assume w1(t) also assures maximum output z. Let uq be the unit vector along the qth direction in weight space. Consider weight decay and regularization. (a) We note that the error can be written ... Clearly this prior favors small weights; that is, the prior is large if wij is small. The joint density of x, w is ... Given the definition of the criterion function ... If we choose the qth component, then ... Since we want to minimize ... Consider a simple 2-1 network with bias. We assume that each character has an equal chance of being typed.
We assume that the typing of each character is an independent event, and thus the probability of typing any particular string of length m is r^m. In other words, all of the feasible solutions are at the corners of a hypercube, and are of equal distance to the "middle" (the center) of the cube. Their distance from the center is √N. (c) Along any "cut" in the feature space parallel to one of the axes the energy will be monotonic; in fact, it will be linear. This is because all other features are held constant, and the energy depends monotonically on a single feature. Since the connection matrix is symmetric ... However, since t(N) is monotonically increasing with respect to N, a simple search can be used to solve the problem; we merely increase N until the corresponding time exceeds the specified time. A straightforward calculation for a reasonable N would overflow on most computers if no special software is used. We assume the other N − 1 magnets produce an average magnetic field. In other words, we consider the magnet in question, i, as if it is in an external magnetic field of the same strength as the average field. Therefore, we can calculate the probabilities in a similar way to that in part (b). Therefore the energy is N. This can be shown as follows. First, each network without the constant unit can be converted to a network with the constant unit by including the constant node 0 and assigning the connection weights w0j a zero value. Second, each network with the constant unit can also be converted to a network without the constant unit by replacing the constant unit with a pair of units, numbered 1 and 2. Node 1 assumes the same connection weights to other units as the constant unit 0, while node 2 is only connected to node 1. The connection weight between nodes 1 and 2 is a very large positive number M. In English, however, there is usually more than one way to pronounce a single number.
For instance, 2000 can be pronounced "two thousand" or "twenty hundred"; however, the grammar above does not allow this second pronunciation. 35. Problem not yet solved 36. Thus this grammar is not in CNF. (b) Here the rewrite rules of G ... Now we expand our intermediate Ca variables to include this new Ca, and the rewrite rules accordingly, to form G2. This result is obvious for final strings of length 1. Suppose now that it is true for derivations requiring k steps. Consider the Ugly Duckling Theorem (Theorem 9.2). (a) If a classification problem has constraints placed on the features, the number of patterns will be less than that of an unconstrained problem. To determine similarity, we count the number of predicates shared by two cars. With car A and car B, the predicates are x1 OR x2, and x1 OR x2 OR x3. With car B and car C, the predicates can be counted in the same way. Since the number of predicates is the same, the patterns are "equally similar," according to the definition given in the text. In this framework, the patterns would be: x1: NOT f1 AND f2 AND NOT f3; x2: NOT f1 AND NOT f2 AND f3; x3: f1 AND NOT f2 AND NOT f3. The result is that the three patterns differ from one another by only one feature, and hence cars A and B are equally similar as cars B and C. 12. Problem not yet solved 13. This does not change the complexity, and the answer is O(1). (f) As in part (e), but now we must specify n, which has complexity O(log2 n). 14. Problem not yet solved 15. The definition "the least number that cannot be defined in less than twenty words" is itself a definition of less than twenty words. The definition of the Kolmogorov complexity is the length of the shortest program to describe a string. We now apply this result to the function g(x). The basic operation of the algorithm is the computation of the distance between a sample and the center of a cluster, which takes O(d) time since each dimension needs to be compared separately.
During each iteration of the algorithm, we have to classify each sample with respect to each cluster center, which amounts to a total of O(nc) distance computations for a total complexity of O(ncd). Each cluster center then needs to be updated; the update step takes O(cd) time. The gray region represents those assignments that are valid, that is, assignments in which none of the Di are empty. In clustering, no such labeling is given. This total number is c^n. (a) We define the value vk at level k to be min δ(Di, Dj), where δ(Di, Dj) is the dissimilarity between pairs of clusters Di and Dj. Then n > c and there must exist at least one subset in the partition containing two or more samples; we call that subset Dj. But by assumption the partition chosen minimized Je, and hence we have a contradiction. Thus there can be no empty subsets in a partition that minimizes Je (if n > c). This is because SB is often singular, even if samples are not from a subspace. Even when SB is not singular, some eigenvalue λi is likely to be very small, and this makes the product small. Hence the optimal clustering based on Jd is always the optimal clustering based on Jd′. Optimal clustering is invariant to non-singular linear transformations of the data. 28. We consider a sample being transferred from Di to Dj. We let Je′ be the criterion function which results from transferring the sample from Di to Dj. We now use the Hölder inequality to prove that the triangle inequality holds for the Minkowski metric, a result known as Minkowski's inequality. (a) The following is known as Kruskal's minimal spanning tree algorithm. If we compute all distances on demand, we only need to store the spanning tree T and the clusters, which are both at most of length n. (c) The time complexity is dominated by the initial sorting step, which takes O(n^2 log n) time.
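Kruskal's minimal spanning tree algorithm mentioned above can be sketched generically; this union-find version is my own (the solution's O(n^2 log n) bound comes from sorting the n^2 pairwise distances, which dominates the near-logarithmic union-find operations):

```python
def kruskal_mst(n, edges):
    # Kruskal's algorithm: sort edges by weight, then add each edge whose
    # endpoints lie in different components (union-find with path halving).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    mst = []
    for w, u, v in sorted(edges):   # dominant O(E log E) sorting step
        ru, rv = find(u), find(v)
        if ru != rv:                # endpoints in different components
            parent[ru] = rv         # merge the two components
            mst.append((w, u, v))
    return mst

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]  # (weight, u, v)
mst = kruskal_mst(4, edges)  # three edges, total weight 7
```

For single-linkage clustering one would stop merging once the desired number of components remains, rather than building the full tree.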
We have to examine each edge in the for loop, and that means we have O(n^2) iterations, where each iteration can be done in O(log n) time, assuming we store the clusters as disjoint-set forests in which searches and unions can be done in logarithmic time. Thus the total time complexity of the algorithm is O(n^2 log n). Section 10.12 43. Consider the XOR problem and whether it can be implemented by an ART network as illustrated in the text. (a) Learning the XOR problem means that the ART network is supposed to learn the following input-output relation. We augment all the input vectors and normalize them to unit weight so that they lie on the unit sphere in R^3. Thus there is no set of weights that the ART network could converge to that would implement the XOR solution. (b) Given a weight vector w for cluster C1 and the vigilance parameter: the weight vector w is now slightly adjusted, and now also x2 is close enough to w to be classified as belonging to cluster C1; thus we will not create another cluster. In contrast, if we present x2 before x1, the angle between w and x2 is smaller than θ; thus we will introduce a new cluster C2 containing x2. Due to the fixed value of the vigilance parameter but weights that change in dependence on the input, the number of clusters depends on the order in which the samples are presented. (c) In a stationary environment the ART network is able to classify inputs robustly, because the feedback connections will drive the network into a stable state even when the input is corrupted by noise. In a non-stationary environment the feedback mechanism will delay the adaptation to the changing input vectors, because the feedback connections will interpret the changing inputs as noisy versions of the original input and will try to force the input to stay the same. Since Σ is symmetric and positive semi-definite, we know that all eigenvalues are real and greater than or equal to zero. Estimate k such that the expected error is 25.
Discuss your results.