Determining whether a document is Korean

by 장선웅 Java [2008.06.03 20:07:26]


KorString.zip (2,267Bytes)


/**
 * Checks whether a piece of text is a Korean document,
 * i.e. whether it contains any Hangul characters.
 */
public class KorString {
 
 // Hangul Syllables occupy the Unicode range 0xAC00-0xD7A3.
 private boolean isKoreanChar(char ch) {
  return ch >= 0xAC00 && ch <= 0xD7A3;
 }
 
 // Returns true as soon as the string contains at least one Hangul syllable.
 public boolean isKorean(String str) {
  for (char ch : str.toCharArray()) {
   if (isKoreanChar(ch)) {
    return true;
   }
  }
  return false;
 }
 
 public static void main(String[] args) {
  KorString check = new KorString();
 
  
  // Tests
  System.out.println(check.isKorean("한글입니다.")); // true
  System.out.println(check.isKorean("abcdef"));  // false
  System.out.println(check.isKorean("abcd한글ef123.")); // true
  
  // Long text test: a long English-only passage
  StringBuilder sb = new StringBuilder();
  
  sb.append("love  /lv/   (loves, loving, loved)"); 
  sb.append("1. VERB");
  sb.append("If you love someone, you feel romantically or sexually attracted to them, and they are very important to you.");
  sb.append("    Oh, Amy, I love you.");
  sb.append("    We love each other. We want to spend our lives together.");  
  sb.append("2. N-UNCOUNT");
  sb.append("Love is a very strong feeling of affection towards someone who you are romantically or sexually attracted to.");
  sb.append("    Our love for each other has been increased by what we’ve been through together.");
  sb.append("    a old fashioned love story.");
  sb.append("    an album of love songs.");
  sb.append("3. VERB");
  sb.append("You say that you love someone when their happiness is very important to you, so that you behave in a kind and caring way towards them.");
  sb.append("    You’ll never love anyone the way you love your baby.");
  sb.append("4. N-UNCOUNT");
  sb.append("Love is the feeling that a person’s happiness is very important to you, and the way you show this feeling in your behaviour towards them.");
  sb.append("    My love for all my children is unconditional.");
  sb.append("    She’s got a great capacity for love.");
  sb.append("5. VERB");
  sb.append("If you love something, you like it very much.");
  sb.append("    We loved the food so much, especially the fish dishes.");
  sb.append("    I loved reading.");
  sb.append("    one of these people that loves to be in the outdoors.");
  sb.append("    I love it when I hear you laugh.");
  sb.append("6. VERB");
  sb.append("You can say that you love something when you consider that it is important and want to protect or support it.");
  sb.append("    I love my country as you love yours.");
  sb.append("7. N-UNCOUNT : oft N of n");
  sb.append("Love is a strong liking for something, or a belief that it is important.");
  sb.append("    The French are known for their love of their language.");
  sb.append("8. N-COUNT : usu with poss");
  sb.append("Your love is someone or something that you love.");
  sb.append("    `She is the love of my life,’ he said.");
  sb.append("    Music’s one of my great loves.");
  sb.append("9. VERB");
  sb.append("If you would love to have or do something, you very much want to have it or do it.");
  sb.append("    I would love to play for England again.");
  sb.append("    I would love a hot bath and clean clothes.");
  sb.append("    His wife would love him to give up his job.");
  sb.append("10. N-VOC");
  sb.append("Some people use love as an affectionate way of addressing someone.[ BRIT, INFORMAL ]");
  sb.append("    Well, I’ll take your word for it then, love.");
  sb.append("    Don’t cry, my love.");
  sb.append("    dear, darling   ");
  sb.append("11. NUM");
  sb.append("In tennis, love is a score of zero.");
  sb.append("    He beat Thomas Muster of Austria three sets to love.");
  sb.append("12. CONVENTION");
  sb.append("You can use expressions such as `love’, `love from’, and `all my love’, followed by your name, as an informal way of ending a letter to a friend or relation.");
  sb.append("    with love from Grandma and Grandpa.");
  sb.append("13. N-UNCOUNT : poss N");
  sb.append("If you send someone your love, you ask another person, who will soon be speaking or writing to them, to tell them that you are thinking about them with affection.");
  sb.append("    Please give her my love.");
  sb.append("14.   see also  -loved, loving, free love, peace-loving, tug-of-love");
  sb.append("15. PHRASE : V inflects, oft PHR with n");
  sb.append("If you fall in love with someone, you start to be in love with them.");
  sb.append("    I fell in love with him because of his kind nature.");
  sb.append("    We fell madly in love.");
  sb.append("16. PHRASE : V inflects, usu PHR with n");
  sb.append("If you fall in love with something, you start to like it very much.");
  sb.append("    Working with Ford closely, I fell in love with the cinema.");
  sb.append("17. PHRASE : V inflects, oft PHR with n");
  sb.append("If you are in love with someone, you feel romantically or sexually attracted to them, and they are very important to you.");
  sb.append("    Laura had never before been in love.");
  sb.append("    I’ve never really been in love with anyone.");
  sb.append("    We were madly in love for about two years.");
  sb.append("18. PHRASE : V inflects, usu PHR with n");
  sb.append("If you are in love with something, you like it very much.");
  sb.append("    He had always been in love with the enchanted landscape of the West.");
  sb.append("19. PHRASE : V inflects, oft pl-n PHR, PHR to/with n");
  sb.append("When two people make love, they have sex.");
  sb.append("    Have you ever made love to a girl before?.");
    
//  TickObject tick = new TickObject(); // timing
//  tick.start();
  System.out.println(check.isKorean(sb.toString())); // false
//  tick.printlab(); // responsed time : 0 ms. mem: 245,232 (alloc: 245,232 free: 1,786,384) // it took 0 seconds
 }
 
}

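Refusing posts written entirely in English, like this, seems like one possible approach~;
(A bit like closing Internet Explorer when you land on an English site with no Korean on it... ^^;)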

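I tried this out while digesting after a meal.
(The downside is that it can't stop spam, e.g. from China, Japan, or Taiwan, whose text contains characters in the Hangul Unicode range ;^^;)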
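
One way to soften that downside is to look at how much of the text is Hangul instead of whether a single Hangul character appears. The following is only a sketch under assumptions: the class name `HangulRatio`, both method names, and the 30% threshold are made up for illustration and are not part of KorString.zip. It relies on the standard `Character.UnicodeScript` API (Java 7 and later), which also recognizes Hangul Jamo outside the 0xAC00-0xD7A3 block.

```java
// Hypothetical helper, not part of KorString.zip: measures how much of a text is Hangul.
final class HangulRatio {

    // Returns the fraction of letters in the text that belong to the Hangul script.
    // Non-letters (digits, spaces, punctuation) are ignored; returns 0.0 if there are no letters.
    static double hangulRatio(String text) {
        int letters = 0;
        int hangul = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue;
            }
            letters++;
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HANGUL) {
                hangul++;
            }
        }
        return letters == 0 ? 0.0 : (double) hangul / letters;
    }

    // Assumed threshold: treat a text as Korean when at least 30% of its letters are Hangul.
    static boolean isMostlyKorean(String text) {
        return hangulRatio(text) >= 0.3;
    }

    public static void main(String[] args) {
        System.out.println(isMostlyKorean("한글입니다."));                          // true
        System.out.println(isMostlyKorean("abcdef"));                               // false
        System.out.println(isMostlyKorean("cheap pills, buy now, best price 가"));  // false
    }
}
```

With a ratio, a spam post that slips in one Hangul character no longer counts as Korean, while the threshold itself would need tuning against real posts.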

If this doesn't fit the board, please delete it~

by 김정식 [2008.06.04 10:36:54]
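If you filter on English alone, posts that contain nothing but a link can't be submitted either,
and there is also a huge amount of Korean-language spam..
So, to briefly go over the spam blocking currently implemented on 오라클클럽..
it works in three parts..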

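1. IP blocking.
Register the offending IPs in a Deny from directive in Apache to block them.

2. Spam blocking.
Maintain a list of spam keywords and pattern-match them with regular expressions to check whether something is spam.
This is applied to post bodies, post titles, and comment bodies.

3. XSS blocking.
Block JavaScript and unnecessary tags, and convert special characters so they can be displayed safely on the web.
This also uses regular expressions for the pattern checks.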
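
As a rough illustration of item 3, the sketch below removes `<script>` blocks with a regular expression and then converts the HTML special characters to entities. The actual 오라클클럽 filter is not shown in this thread, so the class name `XssFilter`, the method names, and the exact pattern are all assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real site filter is not shown in the thread.
final class XssFilter {

    // Matches <script ...>...</script> blocks, case-insensitively and across line breaks.
    private static final Pattern SCRIPT_BLOCK = Pattern.compile(
            "<script\\b[^>]*>.*?</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces the characters that have special meaning in HTML with entities.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Drops script blocks first, then escapes whatever markup is left.
    static String sanitize(String input) {
        return escapeHtml(SCRIPT_BLOCK.matcher(input).replaceAll(""));
    }

    public static void main(String[] args) {
        System.out.println(sanitize("hi <script>alert(1)</script><b>there</b>"));
        // prints: hi &lt;b&gt;there&lt;/b&gt;
    }
}
```

Escaping everything that remains is the conservative choice here; a board that wants to allow some tags would need an explicit allow-list instead.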

With all of that in place..
whenever new spam shows up, I register it in the spam block list and add the IP to the block list in Apache..
That's how it's being run..
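
The keyword check in item 2, together with the routine of registering new spam as it appears, could look roughly like this. It is a minimal sketch, not the site's code: the class name `SpamKeywordFilter`, its methods, and the sample keywords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; class name and keywords are made up for illustration.
final class SpamKeywordFilter {

    private final List<String> keywords = new ArrayList<>();
    private Pattern pattern; // null while no keywords are registered

    // Registers a new keyword (matched literally) and rebuilds the combined pattern.
    void addKeyword(String keyword) {
        keywords.add(Pattern.quote(keyword));
        pattern = Pattern.compile(String.join("|", keywords),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Returns true if any of the given fields (title, body, comment) contains a keyword.
    boolean isSpam(String... fields) {
        if (pattern == null) {
            return false;
        }
        for (String field : fields) {
            if (field != null && pattern.matcher(field).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SpamKeywordFilter filter = new SpamKeywordFilter();
        filter.addKeyword("casino");
        filter.addKeyword("대출");
        System.out.println(filter.isSpam("Question about joins", "How do I write an outer join?")); // false
        System.out.println(filter.isSpam("Best CASINO bonus", "click here"));                        // true
    }
}
```

Keeping the keywords in one alternation means each field is scanned once per check, and adding a keyword for newly seen spam is a single call.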

by 장선웅 [2008.06.04 11:52:53]
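So in the end, like you said, there's no way around building up rules. T.T
Running a site really does take a lot of care.
The diligence you put into it is a good example to follow~ ^^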